(54) Codebook gain attenuation during frame erasures



(57) A codebook-based speech decoder which fails
to receive reliably at least a portion of a current frame of
compressed speech information uses a codebook gain
which is an attenuated version of a gain from a previous
frame of speech. The speech decoder includes a codebook
memory and a signal amplifier. The memory and
amplifier are used in generating a decoded speech signal
based on compressed speech information. The
compressed speech information includes a scale-factor
for use by the amplifier in scaling a codebook vector.
When a frame erasure occurs, a scale-factor corresponding
to a previous frame of speech is attenuated
and the attenuated scale factor is used to amplify the
codebook vector corresponding to the current erased
frame of speech.

FIG. 1 
Description

Field of the Invention

The present invention relates generally to speech coding arrangements for use in communication systems, and
more particularly to the ways in which such speech coders function in the event of burst-like errors in transmission.

Background of the Invention

Many communication systems, such as cellular telephone and personal communications systems, rely on wireless
channels to communicate information. In the course of communicating such information, wireless communication
channels can suffer from several sources of error, such as multipath fading. These error sources can cause, among other
things, the problem of frame erasure. Erasure refers to the total loss or whole or partial corruption of a set of bits
communicated to a receiver. A frame is a predetermined fixed number of bits which may be communicated as a block
through a communication channel. A frame may therefore represent a time-segment of a speech signal.

If a frame of bits is totally lost, then the receiver has no bits to interpret. Under such circumstances, the receiver
may produce a meaningless result. If a frame of received bits is corrupted and therefore unreliable, the receiver may
produce a severely distorted result. In either case, the frame of bits may be thought of as "erased" in that the frame is
unavailable or unusable by the receiver.

As the demand for wireless system capacity has increased, a need has arisen to make the best use of available
wireless system bandwidth. One way to enhance the efficient use of system bandwidth is to employ a signal
compression technique. For wireless systems which carry speech signals, speech compression (or speech coding) techniques
may be employed for this purpose. Such speech coding techniques include analysis-by-synthesis speech coders, such
as the well-known Code-Excited Linear Prediction (or CELP) speech coder.

The problem of packet loss in packet-switched networks employing speech coding arrangements is very similar to
frame erasure in the wireless context. That is, due to packet loss, a speech decoder may either fail to receive a frame
or receive a frame having a significant number of missing bits. In either case, the speech decoder is presented with the
same essential problem - the need to synthesize speech despite the loss of compressed speech information. Both
"frame erasure" and "packet loss" concern a communication channel (or network) problem which causes the loss of
transmitted bits. For purposes of this description, the term "frame erasure" may be deemed to include "packet loss."

Among other things, CELP speech coders employ a codebook of excitation signals to encode an original speech
signal. These excitation signals, scaled by an excitation gain, are used to "excite" filters which synthesize a speech
signal (or some precursor to a speech signal) in response to the excitation. The synthesized speech signal is compared to
the original speech signal. The codebook excitation signal is identified which yields a synthesized speech signal which
most closely matches the original signal. The identified excitation signal's codebook index and gain representation
(which is often itself a gain codebook index) are then communicated to a CELP decoder (depending upon the type of
CELP system, other types of information, such as linear prediction (LPC) filter coefficients, may be communicated as
well). The decoder contains codebooks identical to those of the CELP coder. The decoder uses the transmitted indices
to select an excitation signal and gain value. This selected scaled excitation signal is used to excite the decoder's LPC
filter. Thus excited, the LPC filter of the decoder generates a decoded (or quantized) speech signal - the same speech
signal which was previously determined to be closest to the original speech signal.

Some CELP systems also employ other components, such as a periodicity model (e.g., a pitch-predictive filter or
an adaptive codebook). Such a model simulates the periodicity of voiced speech. In such CELP systems, parameters
relating to these components must also be sent to the decoder. In the case of an adaptive codebook, signals
representing a pitch-period (delay) and adaptive codebook gain must also be sent to the decoder so that the decoder can
recreate the operation of the adaptive codebook in the speech synthesis process.

Wireless and other systems which employ speech coders may be more sensitive to the problem of frame erasure
than those systems which do not compress speech. This sensitivity is due to the reduced redundancy of coded speech
(compared to uncoded speech) making the possible loss of each transmitted bit more significant. In the context of a
CELP speech coder experiencing frame erasure, excitation signal codebook indices and other signals representing
speech in the frame may be either lost or substantially corrupted, preventing proper synthesis of speech at the decoder.
For example, because of the erased frame(s), the CELP decoder will not be able to reliably identify which entry in its
codebook should be used to synthesize speech. As a result, speech coding system performance may degrade
significantly.

Because frame erasure causes the loss of excitation signal codebook indices, LPC coefficients, adaptive
codebook delay information, and adaptive and fixed codebook gain information, normal techniques for synthesizing an
excitation signal in a speech decoder are ineffective. Therefore, these normal techniques must be replaced by alternative
measures.
Summary of the Invention

The present invention addresses the lack of codebook gain information during frame erasure. In
accordance with the present invention, a codebook-based speech decoder which fails to receive reliably at least a
portion of a current frame of compressed speech information uses a codebook gain which is an attenuated version of a
gain from a previous frame of speech.

An illustrative embodiment of the present invention is a speech decoder which includes a codebook memory and a
signal amplifier. The memory and amplifier are used in generating a decoded speech signal based on compressed
speech information. The compressed speech information includes a scale-factor for use by the amplifier in scaling a
codebook vector. When a frame erasure occurs, a scale-factor corresponding to a previous frame of speech is
attenuated and the attenuated scale factor is used to amplify the codebook vector corresponding to the current erased frame
of speech. Specific details of an embodiment of the present invention are presented in section II.D of the Detailed
Description.

The present invention is applicable to both fixed and adaptive codebook processing, and also to systems which
insert other elements (such as a pitch-predictive filter) between a codebook and its amplifier. See
the Detailed Description for a discussion relating to the present invention.

Brief Description of the Drawings

Figure 1 presents a block diagram of a G.729 Draft decoder modified in accordance with the present invention.
Figure 2 presents an illustrative wireless communication system employing the embodiment of the present invention.

Detailed Description
I. Introduction

The present invention concerns the operation of a speech coding system experiencing frame erasure - that is, the
loss of a group of consecutive bits in the compressed bit-stream, which group is ordinarily used to synthesize speech.
The description which follows concerns features of the present invention applied illustratively to an 8 kbit/s CELP
speech coding system proposed to the ITU for adoption as its international standard G.729. For the convenience of the
reader, a preliminary draft recommendation for the G.729 standard is attached hereto as an Appendix (the draft will be
referred to herein as the "G.729 Draft"). The G.729 Draft includes detailed descriptions of the speech encoder and
decoder (see G.729 Draft sections 3 and 4, respectively). The illustrative embodiment of the present invention
is directed to modifications of the normal decoder operation detailed in the G.729 Draft. No modifications
to the encoder are required to implement the present invention.

The applicability of the present invention to the proposed G.729 standard notwithstanding, those of ordinary skill in
the art will appreciate that features of the present invention have applicability to other speech coding systems.

Knowledge of the erasure of one or more frames is an input signal, e, to the illustrative embodiment of the present
invention. Such knowledge may be obtained in any of the conventional ways well-known in the art. For example, whole
or partially corrupted frames may be detected through the use of a conventional error detection code. When a frame is
determined to have been erased, e = 1 and special procedures are initiated as described below. Otherwise, if not
erased (e = 0), normal procedures are used. Conventional error protection codes could be implemented as part of a
conventional radio transmission/reception subsystem of a wireless communication system.

In addition to the application of the full set of remedial measures applied as the result of an erasure (e = 1), the
decoder employs a subset of these measures when a parity error is detected. A parity bit is computed based on the
pitch delay index of the first of two subframes of a frame of coded speech. See G.729 Draft Section 3.7.1. This parity
bit is computed by the decoder and checked against the parity bit received from the encoder. If the two parity bits are
not the same, the delay index is said to be corrupted (PE = 1, in the embodiment) and special processing of the pitch
delay is invoked.

For clarity of explanation, the illustrative embodiment of the present invention is presented as comprising individual
functional blocks. The functions these blocks represent may be provided through the use of either shared or dedicated
hardware, including, but not limited to, hardware capable of executing software. For example, the blocks presented in
Figure 1 may be provided by a single shared processor. (Use of the term "processor" should not be construed to refer
exclusively to hardware capable of executing software.)

Illustrative embodiments may comprise digital signal processor (DSP) hardware, such as the AT&T DSP16 or
DSP32C, read-only memory (ROM) for storing software performing the operations discussed below, and random
access memory (RAM) for storing DSP results. Very large scale integration (VLSI) hardware embodiments, as well as
custom VLSI circuitry in combination with a general purpose DSP circuit, may also be provided.
II. An Illustrative Embodiment

Figure 1 presents a block diagram of a G.729 Draft decoder modified in accordance with the present invention
(Figure 1 is a version of figure 3 of the G.729 standard draft which has been augmented to more clearly illustrate features
of the claimed invention). In normal operation (i.e., without experiencing frame erasure) the decoder operates in
accordance with the G.729 Draft as described in sections 4.1 - 4.2. During frame erasure, the operation of the
embodiment of Figure 1 is augmented by special processing to make up for the erasure of information from the encoder.

A. Normal Decoder Operation 

The encoder described in the G.729 Draft provides a frame of data representing compressed speech every 10 ms.
The frame comprises 80 bits and is detailed in Tables 1 and 9 of the G.729 Draft. Each 80-bit frame of compressed
speech is sent over a communication channel to a decoder which synthesizes a speech signal (representing two
subframes) based on the frame produced by the encoder. The channel over which the frames are communicated (not
shown) may be of any type (such as conventional telephone networks, packet-based networks, cellular or wireless
networks, ATM networks, etc.) and/or may comprise a storage medium (such as magnetic storage, semiconductor RAM
or ROM, optical storage such as CD-ROM, etc.).

The illustrative decoder of Figure 1 includes both an adaptive codebook (ACB) portion and a fixed codebook (FCB)
portion. The ACB portion includes ACB 50 and a gain amplifier 55. The FCB portion includes a FCB 10, a pitch
predictive filter (PPF) 20, and gain amplifier 30. The decoder decodes transmitted parameters (see G.729 Draft Section 4.1)
and performs synthesis to obtain reconstructed speech.

The FCB 10 operates in response to an index, I, sent by the encoder. Index I is received through switch 40. The
FCB 10 generates a vector, c(n), of length equal to a subframe. See G.729 Draft Section 4.1.2. This vector is applied
to the PPF 20. PPF 20 operates to yield a vector for application to the FCB gain amplifier 30. See G.729 Draft Sections
3.8 and 4.1.3. The amplifier, which applies a gain, $g_c$, from the channel, generates a scaled version of the vector
produced by the PPF 20. See G.729 Draft Section 4.1.3. The output signal of the amplifier 30 is supplied to summer 85
(through switch 42).

The gain applied to the vector produced by PPF 20 is determined based on information provided by the encoder.
This information is communicated as codebook indices. The decoder receives these indices and synthesizes a gain
correction factor, $\gamma$. See G.729 Draft Section 4.1.4. This gain correction factor, $\gamma$, is supplied to code vector prediction
energy (E-) processor 120. E-processor 120 determines a value of the code vector predicted error energy, $R$, in
accordance with the following expression:

$$R^{(n)} = 20 \log \gamma \quad [\mathrm{dB}]$$

The value of $R$ is stored in a processor buffer which holds the five most recent (successive) values of $R$. $R^{(n)}$
represents the predicted error energy of the fixed code vector at subframe n. The predicted mean-removed energy of the
codevector is formed as a weighted sum of past values of $R$:

$$\tilde{E}^{(n)} = \sum_{i=1}^{4} b_i R^{(n-i)}$$

where b = [0.68 0.58 0.34 0.19] and where the past values of $R$ are obtained from the buffer. This predicted energy is
then output from processor 120 to a predicted gain processor 125.

Processor 125 determines the actual energy of the code vector supplied by codebook 10. This is done according
to the following expression:

$$E = 10 \log \left( \frac{1}{40} \sum_{i=0}^{39} c_i^2 \right)$$

where i indexes the samples of the vector. The predicted gain is then computed as follows:

$$g'_c = 10^{(\tilde{E}^{(n)} + \bar{E} - E)/20}$$

where $\tilde{E}^{(n)}$ is the predicted mean-removed energy and $\bar{E}$ is the mean energy of the FCB (e.g., 30 dB).

Finally, the actual scale factor (or gain) is computed by multiplying the received gain correction factor $\gamma$ by the
predicted gain, $g'_c$, at multiplier 130. This value is then supplied to amplifier 30 to scale the fixed codebook contribution
provided by PPF 20.
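
The fixed codebook gain computation described above can be sketched in Python. This is a minimal illustration rather than the G.729 Draft reference code: the function names are hypothetical, base-10 logarithms are assumed, and the energy and predicted-gain formulas follow the form used in the published G.729 recommendation, which may differ in detail from the Draft.

```python
import math

B = [0.68, 0.58, 0.34, 0.19]   # predictor coefficients b_1..b_4 given in the text
MEAN_ENERGY_DB = 30.0          # mean FCB energy (the text's example value)


def predicted_energy(r_history):
    """Weighted sum of past R values; r_history[0] is the most recent."""
    return sum(b * r for b, r in zip(B, r_history))


def code_vector_energy_db(c):
    """Mean energy of code vector c in dB (assumed form: 10*log10(mean(c^2)))."""
    return 10.0 * math.log10(sum(x * x for x in c) / len(c))


def fixed_codebook_gain(gamma, c, r_history):
    """Return (g_c, R) for one correctly received subframe.

    gamma     -- received gain correction factor
    c         -- fixed codebook vector (must not be all zeros)
    r_history -- past R values, most recent first
    """
    e_pred = predicted_energy(r_history)
    g_pred = 10.0 ** ((e_pred + MEAN_ENERGY_DB - code_vector_energy_db(c)) / 20.0)
    r_new = 20.0 * math.log10(gamma)   # value to be stored in the R buffer
    return gamma * g_pred, r_new
```

With a unit-amplitude vector, an all-zero history and a correction factor of 1, the gain reduces to the mean-energy term alone, which gives a quick sanity check of the arithmetic.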

Also provided to the summer 85 is the output signal generated by the ACB portion of the decoder. The ACB portion
comprises the ACB 50 which generates an excitation signal, v(n), of length equal to a subframe based on past excitation
signals and the ACB pitch-period, M, received (through switch 43) from the encoder via the channel. See G.729 Draft
Section 4.1.1. This vector is scaled by amplifier 55 based on gain factor, $g_p$, received over the channel. This scaled vector
is the output of the ACB portion.

Summer 85 generates an excitation signal, u(n), in response to signals from the FCB and ACB portions of the
decoder. The excitation signal, u(n), is applied to an LPC synthesis filter 90 which synthesizes a speech signal based
on LPC coefficients, $a_i$, received over the channel. See G.729 Draft Section 4.1.6.
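
As a rough sketch of the signal path just described, the summer and the synthesis filter might look as follows in Python. The function names, the direct-form all-pole structure, and the sign convention A(z) = 1 + sum of a_i z^-i are assumptions made for illustration; the G.729 Draft defines the actual filter.

```python
def build_excitation(v, c, g_p, g_c):
    """Summer 85: u(n) = g_p * v(n) + g_c * c(n) for one subframe.

    v -- adaptive codebook vector, c -- fixed codebook vector after the PPF.
    """
    return [g_p * vn + g_c * cn for vn, cn in zip(v, c)]


def lpc_synthesis(u, a, state):
    """All-pole synthesis filter 1/A(z), A(z) = 1 + sum_i a[i-1] * z^-i (assumed convention).

    state holds the previous len(a) output samples, most recent first.
    Returns the synthesized samples and the updated filter state.
    """
    mem = list(state)
    out = []
    for un in u:
        s = un - sum(ai * mi for ai, mi in zip(a, mem))
        out.append(s)
        mem = [s] + mem[:-1]   # shift the newest output into the filter memory
    return out, mem
```

Carrying the filter state from one subframe to the next keeps the output continuous across subframe boundaries, which matters during erasures because the filter keeps running on substitute excitation.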

Finally, the output of the LPC synthesis filter 90 is supplied to a post processor 100 which performs adaptive
postfiltering (see G.729 Draft Sections 4.2.1 - 4.2.4), high-pass filtering (see G.729 Draft Section 4.2.5) and up-scaling
(see G.729 Draft Section 4.2.5).

B. Excitation Signal Synthesis During Frame Erasure 

In the presence of frame erasures, the decoder of Figure 1 does not receive reliable information (if it receives
anything at all) from which an excitation signal, u(n), may be synthesized. As such, the decoder will not know which vector
of signal samples should be extracted from codebook 10, or what is the proper delay value to use for the adaptive
codebook 50. In this case, the decoder must obtain a substitute excitation signal for use in synthesizing a speech signal. The
generation of a substitute excitation signal during periods of frame erasure is dependent on whether the erased frame
is classified as voiced (periodic) or unvoiced (aperiodic). An indication of periodicity for the erased frame is obtained
from the post processor 100, which classifies each properly received frame as periodic or aperiodic. See G.729 Draft
Section 4.2.1. The erased frame is taken to have the same periodicity classification as the previous frame processed
by the postfilter. The binary signal representing periodicity, v, is determined according to postfilter variable $g_{pit}$. Signal
v = 1 if $g_{pit}$ > 0; else, v = 0. As such, for example, if the last good frame was classified as periodic, v = 1; otherwise v = 0.

1. Erasure of Frames Representing Periodic Speech 

For an erased frame (e = 1) which is thought to have represented speech which is periodic (v = 1), the contribution
of the fixed codebook is set to zero. This is accomplished by switch 42 which switches states (in the direction of the
arrow) from its normal (biased) operating position coupling amplifier 30 to summer 85 to a position which decouples the
fixed codebook contribution from the excitation signal, u(n). This switching of state is accomplished in accordance with
the control signal developed by AND-gate 110 (which tests for the condition that the frame is erased, e = 1, and it was
a periodic frame, v = 1). On the other hand, the contribution of the adaptive codebook is maintained in its normal
operating position by switch 45 (since e = 1 but not_v = 0).

The pitch delay, M, used by the adaptive codebook during an erased frame is determined by delay processor 60.
Delay processor 60 stores the most recently received pitch delay from the encoder. This value is overwritten with each
successive pitch delay received. For the first erased frame following a "good" (correctly received) frame, delay
processor 60 generates a value for M which is equal to the pitch delay of the last good frame (i.e., the previous frame). To avoid
excessive periodicity, for each successive erased frame processor 60 increments the value of M by one (1). The
processor 60 restricts the value of M to be less than or equal to 143 samples. Switch 43 effects the application of the pitch
delay from processor 60 to adaptive codebook 50 by changing state from its normal operating position to its "voiced
frame erasure" position in response to an indication of an erasure of a voiced frame (since e = 1 and v = 1).
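
The behaviour of delay processor 60 can be sketched as a small state machine. The class and method names below are hypothetical, as is the initial delay value; only the repeat-then-increment rule and the limit of 143 samples come from the text.

```python
MAX_PITCH_DELAY = 143  # upper limit on M stated in the text


class DelayProcessor:
    """Sketch of delay processor 60 for voiced (periodic) frame erasures."""

    def __init__(self, initial_delay=40):  # initial value is an arbitrary assumption
        self.delay = initial_delay
        self.prev_erased = False

    def next_delay(self, erased, received_delay=None):
        if not erased:
            # Good frame: store (overwrite) the delay received from the encoder.
            self.delay = received_delay
            self.prev_erased = False
        else:
            # First erased frame reuses the last good delay; each further
            # erased frame adds one sample, capped at MAX_PITCH_DELAY.
            if self.prev_erased:
                self.delay = min(self.delay + 1, MAX_PITCH_DELAY)
            self.prev_erased = True
        return self.delay
```

For example, a good frame with delay 142 followed by three erased frames yields delays 142, 143 and 143.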

The adaptive codebook gain is also synthesized in the event of an erasure of a voiced frame in accordance with the
procedure discussed below in section D. Note that switch 44 operates identically to switch 43 in that it effects the
application of a synthesized adaptive codebook gain by changing state from its normal operating position to its "voiced frame
erasure" position.

2. Erasure of Frames Representing Aperiodic Speech 

For an erased frame (e = 1) which is thought to have represented speech which is aperiodic (v = 0), the contribution
of the adaptive codebook is set to zero. This is accomplished by switch 45 which switches states (in the direction of the
arrow) from its normal (biased) operating position coupling amplifier 55 to summer 85 to a position which decouples the
adaptive codebook contribution from the excitation signal, u(n). This switching of state is accomplished in accordance
with the control signal developed by AND-gate 75 (which tests for the condition that the frame is erased, e = 1, and it
was an aperiodic frame, not_v = 1). On the other hand, the contribution of the fixed codebook is maintained in its normal
operating position by switch 42 (since e = 1 but v = 0).
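
The switching rules for periodic and aperiodic erasures amount to two AND conditions on the signals e and v. A minimal sketch, with hypothetical function names and with the postfilter variable passed in as an ordinary number:

```python
def periodicity_flag(g_pit):
    """v = 1 if the postfilter variable is positive, else 0."""
    return 1 if g_pit > 0 else 0


def contributions(e, v):
    """Return (use_fcb, use_acb) for the current frame.

    e -- 1 if the frame is erased, else 0
    v -- periodicity flag carried over from the last good frame
    """
    fcb_off = e == 1 and v == 1   # AND-gate 110 moves switch 42
    acb_off = e == 1 and v == 0   # AND-gate 75 moves switch 45
    return (not fcb_off, not acb_off)
```

In a good frame both codebooks contribute; in an erased frame exactly one of them does, selected by the periodicity of the previous frame.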
The fixed codebook index, I, and codebook vector sign are not available due to the erasure. In order to synthesize a
fixed codebook index and sign index from which a codebook vector, c(n), could be determined, a random number
generator 45 is used. The output of the random number generator 45 is coupled to the fixed codebook 10 through switch
40. Switch 40 is normally in a state which couples index I and sign information to the fixed codebook. However, gate 47
applies a control signal to the switch which causes the switch to change state when an erasure occurs of an aperiodic
frame (e = 1 and not_v = 1).

The random number generator 45 employs the function:

seed = seed * 31821 + 13849

to generate the fixed codebook index and sign. The initial seed value for the generator 45 is equal to 21845. For a given
coder subframe, the codebook index is the 13 least significant bits of the random number. The random sign is the 4 least
significant bits of the next random number. Thus the random number generator is run twice for each fixed codebook
vector needed. Note that a noise vector could have been generated on a sample-by-sample basis rather than using the
random number generator in combination with the FCB.
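
A minimal sketch of this generator in Python is given below. The text does not state a word length, so the 16-bit wraparound used here is an assumption, and the class and function names are hypothetical.

```python
SEED_INIT = 21845  # initial seed value given in the text


class ErasureRandom:
    """Linear congruential generator: seed = seed * 31821 + 13849."""

    def __init__(self, seed=SEED_INIT):
        self.seed = seed

    def next(self):
        # Keeping only 16 bits is an assumption; the text gives no modulus.
        self.seed = (self.seed * 31821 + 13849) & 0xFFFF
        return self.seed


def random_index_and_sign(rng):
    """Run the generator twice: once for the index, once for the sign."""
    index = rng.next() & 0x1FFF   # 13 least significant bits
    sign = rng.next() & 0xF       # 4 least significant bits of the next number
    return index, sign
```

Because the index and sign use only the low-order bits, the result does not depend on whether the 16-bit state is interpreted as signed or unsigned.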

The fixed codebook gain is also synthesized in the event of an erasure of an aperiodic frame in accordance with
the procedure discussed below in section D. Note that switch 41 operates identically to switch 40 in that it effects the
application of a synthesized fixed codebook gain by changing state from its normal operating position to its "voiced
frame erasure" position.

Since PPF 20 adds periodicity (when delay is less than a subframe), PPF 20 should not be used in the event of an
erasure of an aperiodic frame. Therefore switch 21 selects either the output of FCB 10 when e = 1 or the output of PPF
20 when e = 0.



C. LPC Filter Coefficients for Erased Frames

The excitation signal, u(n), synthesized during an erased frame is applied to the LPC synthesis filter 90. As with
other components of the decoder which depend on data from the encoder, the LPC synthesis filter 90 must have
substitute LPC coefficients, $a_i$, during erased frames. This is accomplished by repeating the LPC coefficients of the last
good frame. LPC coefficients received from the encoder in a non-erased frame are stored by memory 95. Newly
received LPC coefficients overwrite previously received coefficients in memory 95. Upon the occurrence of a frame
erasure, the coefficients stored in memory 95 are supplied to the LPC synthesis filter via switch 46. Switch 46 is
normally biased to couple LPC coefficients received in a good frame to the filter 90. However, in the event of an erased
frame (e = 1), the switch changes state (in the direction of the arrow) coupling memory 95 to the filter 90.
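
The repetition of LPC coefficients can be sketched as a one-slot memory combined with a selector. The class and method names are hypothetical stand-ins for memory 95 and switch 46.

```python
class LpcMemory:
    """Sketch of memory 95 together with switch 46."""

    def __init__(self, initial):
        self.stored = list(initial)

    def select(self, e, received=None):
        """Return the coefficients to feed to the synthesis filter.

        e        -- 1 if the frame is erased, else 0
        received -- coefficients from the encoder (ignored when e == 1)
        """
        if e == 0:
            # Good frame: overwrite the stored coefficients with the new ones.
            self.stored = list(received)
        # Erased frame: fall through and repeat the last good coefficients.
        return list(self.stored)
```

The same coefficients are returned for every frame of a run of erasures, since the memory is only overwritten by a good frame.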

D. Attenuation of Adaptive and Fixed Codebook Gains

As discussed above, both the adaptive and fixed codebooks 50, 10 have a corresponding gain amplifier 55, 30
which applies a scale factor to the codebook output signal. Ordinarily, the values of the scale factors for these amplifiers
are supplied by the encoder. However, in the event of a frame erasure, the scale factor information is not available from
the encoder. Therefore, the scale factor information must be synthesized.

For both the fixed and adaptive codebooks, the synthesis of the scale factor is accomplished by attenuation
processors 65 and 115 which scale (or attenuate) the value of the scale factor used in the previous subframe. Thus, in the
case of a frame erasure following a good frame, the value of the scale factor of the first subframe of the erased frame
for use by the amplifier is the second scale factor from the good frame multiplied by an attenuation factor. In the case
of successive erased subframes, the later erased subframe (subframe n) uses the value of the scale factor from the
former erased subframe (subframe n-1) multiplied by the attenuation factor. This technique is used no matter how many
successive erased frames (and subframes) occur. Attenuation processors 65, 115 store each new scale factor, whether
received in a good frame or synthesized for an erased frame, in the event that the next subframe will be an erased
subframe.

Specifically, attenuation processor 115 synthesizes the fixed codebook gain, $g_c$, for erased subframe n in
accordance with:

$$g_c^{(n)} = 0.98\, g_c^{(n-1)}$$

Attenuation processor 65 synthesizes the adaptive codebook gain, $g_p$, for erased subframe n in accordance with:

$$g_p^{(n)} = 0.9\, g_p^{(n-1)}$$

In addition, processor 65 limits (or clips) the value of the synthesized gain to be less than 0.9. The process of
attenuating gains is performed to avoid undesired perceptual effects.
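
The attenuation rule can be sketched as a small stateful helper that remembers the last scale factor, whether received or synthesized. The class name is hypothetical, and the attenuation factors 0.98 (fixed codebook) and 0.9 (adaptive codebook) are the values used in the published G.729 recommendation; treat them as assumptions here. The clipping uses `min`, so the synthesized adaptive gain never exceeds 0.9.

```python
FCB_ATTENUATION = 0.98  # assumed factor for processor 115
ACB_ATTENUATION = 0.9   # assumed factor for processor 65
ACB_GAIN_LIMIT = 0.9    # clipping limit stated in the text


class GainAttenuator:
    """Sketch of attenuation processors 65 and 115."""

    def __init__(self, factor, limit=None, initial=0.0):
        self.factor = factor
        self.limit = limit
        self.prev = initial   # scale factor used in the previous subframe

    def update(self, erased, received_gain=None):
        if erased:
            g = self.factor * self.prev      # attenuate the previous gain
            if self.limit is not None:
                g = min(g, self.limit)       # clip the synthesized gain
        else:
            g = received_gain                # good subframe: use the channel value
        self.prev = g                        # store for a possible next erasure
        return g
```

Storing the gain in every subframe, good or erased, is what lets the attenuation compound over a run of erasures: each erased subframe starts from the already attenuated value of the one before it.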

E. Attenuation of Gain Predictor Memory

As discussed above, there is a buffer which forms part of E-processor 120 which stores the five most recent values
of the prediction error energy. This buffer is used to predict a value for the predicted energy of the code vector from the
fixed codebook.

However, due to frame erasure, there will be no information communicated to the decoder from the encoder from
which new values of the prediction error energy can be computed. Therefore, such values will have to be synthesized.
This synthesis is accomplished by E-processor 120 according to the following expression:

$$R^{(n)} = \left( 0.25 \sum_{i=1}^{4} R^{(n-i)} \right) - 4.0$$

Thus, a new value for $R$ is computed as the average of the four previous values of $R$ less 4 dB. The attenuation of
the value of $R$ is performed so as to ensure that once a good frame is received undesirable speech distortion is not
created. The value of the synthesized $R$ is limited not to fall below -14 dB.
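
The update of the gain predictor memory during an erasure can be sketched as follows. The function names are hypothetical, and the buffer is modelled as a five-element `collections.deque` with the most recent value first, which is one possible reading of "the five most recent values".

```python
from collections import deque

R_FLOOR_DB = -14.0  # lower limit on the synthesized R stated in the text


def synthesize_r(buffer):
    """Average of the four most recent R values, less 4 dB, floored at -14 dB."""
    recent = list(buffer)[:4]
    r = 0.25 * sum(recent) - 4.0
    return max(r, R_FLOOR_DB)


def push_r(buffer, r):
    """Store a new R value; with maxlen=5 the oldest value drops out."""
    buffer.appendleft(r)


# Example: a run of erased subframes starting from an all-zero history.
history = deque([0.0] * 5, maxlen=5)
for _ in range(3):
    push_r(history, synthesize_r(history))
```

Each erased subframe lowers the stored energy, so the predicted gain for the first good frame after a long erasure is correspondingly smaller.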

F. An Illustrative Wireless System

As stated above, the present invention has application to wireless speech communication systems. Figure 2
presents an illustrative wireless communication system employing an embodiment of the present invention. Figure 2
includes a transmitter 600 and a receiver 700. An illustrative embodiment of the transmitter 600 is a wireless base
station. An illustrative embodiment of the receiver 700 is a mobile user terminal, such as a cellular or wireless telephone
or other personal communications system device. (Naturally, a wireless base station and user terminal may also include
receiver and transmitter circuitry, respectively.) The transmitter 600 includes a speech coder 610 which may be, for
example, a coder according to the G.729 Draft. The transmitter further includes a conventional channel coder 620 to
provide error detection (or detection and correction) capability; a conventional modulator 630; and conventional radio
transmission circuitry; all well known in the art. Radio signals transmitted by transmitter 600 are received by receiver
700 through a transmission channel. Due to, for example, possible destructive interference of various multipath
components of the transmitted signal, receiver 700 may be in a deep fade preventing the clear reception of transmitted bits.
Under such circumstances, frame erasure may occur.

Receiver 700 includes conventional radio receiver circuitry 710, conventional demodulator 720, channel decoder
730, and a speech decoder 740 in accordance with the present invention. Note that the channel decoder generates a
frame erasure signal whenever the channel decoder determines the presence of a substantial number of bit errors (or
unreceived bits). Alternatively (or in addition to a frame erasure signal from the channel decoder), demodulator 720 may
provide a frame erasure signal to the decoder 740.

G. Discussion 

Although specific embodiments of this invention have been shown and described herein, it is to be understood that
these embodiments are merely illustrative of the many possible specific arrangements which can be devised in
application of the principles of the invention. Numerous and varied other arrangements can be devised in accordance with
these principles by those of ordinary skill in the art without departing from the spirit and scope of the invention.

In addition, although the illustrative embodiment of the present invention refers to codebook "amplifiers," it will be
understood by those of ordinary skill in the art that this term encompasses the scaling of digital signals. Moreover, such
scaling may be accomplished with scale factors (or gains) which are less than or equal to one (including negative
values), as well as greater than one.
1 Introduction

This Recommendation contains the description of an algorithm for the coding of speech signals at 8
kbit/s using Conjugate-Structure Algebraic-Code-Excited Linear-Predictive (CS-ACELP) coding.

This coder is designed to operate with a digital signal obtained by first performing telephone
bandwidth filtering (ITU Rec. G.710) of the analog input signal, then sampling it at 8000 Hz,
followed by conversion to 16 bit linear PCM for the input to the encoder. The output of the decoder
should be converted back to an analog signal by similar means. Other input/output characteristics,
such as those specified by ITU Rec. G.711 for 64 kbit/s PCM data, should be converted to 16 bit
linear PCM before encoding, or from 16 bit linear PCM to the appropriate format after decoding.
The bitstream from the encoder to the decoder is defined within this standard.

This Recommendation is organized as follows: Section 2 gives a general outline of the CS-ACELP
algorithm. In Sections 3 and 4, the CS-ACELP encoder and decoder principles are
discussed, respectively. Section 5 describes the software that defines this coder in 16 bit fixed point
arithmetic.
2 General description of the coder

The CS-ACELP coder is based on the code-excited linear-predictive (CELP) coding model. The
coder operates on speech frames of 10 ms corresponding to 80 samples at a sampling rate of 8000
samples/sec. For every 10 msec frame, the speech signal is analyzed to extract the parameters of
the CELP model (LP filter coefficients, adaptive and fixed codebook indices and gains). These
parameters are encoded and transmitted. The bit allocation of the coder parameters is shown in
Table 1. At the decoder, these parameters are used to retrieve the excitation and synthesis filter
parameters. The speech is reconstructed by filtering this excitation through the LP synthesis filter,
as is shown in Figure 1. The short-term synthesis filter is based on a 10th order linear prediction
(LP) filter. The long-term, or pitch synthesis filter is implemented using the so-called adaptive
codebook approach for delays less than the subframe length. After computing the reconstructed
speech, it is further enhanced by a postfilter.

Table 1: Bit allocation of the 8 kbit/s CS-ACELP algorithm (10 msec frame).

Parameter                  Codeword         Subframe 1   Subframe 2   Total per frame
LSP                        L0, L1, L2, L3                             18
Adaptive codebook delay    P1, P2           8            5            13
Delay parity               P0               1                         1
Fixed codebook index       C1, C2           13           13           26
Fixed codebook sign        S1, S2           4            4            8
Codebook gains (stage 1)   GA1, GA2         3            3            6
Codebook gains (stage 2)   GB1, GB2         4            4            8
Total                                                                 80

Figure 1: Block diagram of conceptual CELP synthesis model.
2.1 Encoder



The signal flow at the encoder is shown in Figure 2. The input signal is high-pass filtered and scaled
in the pre-processing block. The pre-processed signal serves as the input signal for all subsequent
analysis. LP analysis is done once per 10 ms frame to compute the LP filter coefficients. These
coefficients are converted to line spectrum pairs (LSP) and quantized using predictive two-stage
vector quantization (VQ) with 18 bits. The excitation sequence is chosen by using an
analysis-by-synthesis search procedure in which the error between the original and synthesized speech is
minimized according to a perceptually weighted distortion measure. This is done by filtering the
error signal with a perceptual weighting filter, whose coefficients are derived from the unquantized
LP filter. The amount of perceptual weighting is made adaptive to improve the performance for
input signals with a flat frequency response.

Figure 2: Signal flow at the CS-ACELP encoder.

The excitation parameters (fixed and adaptive codebook parameters) are determined per subframe
of 5 ms (40 samples) each. The quantized and unquantized LP filter coefficients are used for
the second subframe, while in the first subframe interpolated LP filter coefficients are used (both
quantized and unquantized). An open-loop pitch delay is estimated once per 10 ms frame based
on the perceptually weighted speech signal. Then the following operations are repeated for each
subframe. The target signal $x(n)$ is computed by filtering the LP residual through the weighted
synthesis filter $W(z)/\hat{A}(z)$. The initial states of these filters are updated by filtering the error
between LP residual and excitation. This is equivalent to the common approach of subtracting the
zero-input response of the weighted synthesis filter from the weighted speech signal. The impulse
response, $h(n)$, of the weighted synthesis filter is computed. Closed-loop pitch analysis is then
done (to find the adaptive codebook delay and gain), using the target $x(n)$ and impulse response
$h(n)$, by searching around the value of the open-loop pitch delay. A fractional pitch delay with 1/3
resolution is used. The pitch delay is encoded with 8 bits in the first subframe and differentially
encoded with 5 bits in the second subframe. The target signal $x(n)$ is updated by removing the
adaptive codebook contribution (filtered adaptive codevector), and this new target, $x_2(n)$, is used
in the fixed algebraic codebook search (to find the optimum excitation). An algebraic codebook
with 17 bits (13 index bits and 4 sign bits, see Table 1) is used for the fixed codebook excitation.
The gains of the adaptive and fixed codebook are vector quantized with 7 bits (with MA prediction
applied to the fixed codebook gain). Finally, the filter memories are updated using the determined
excitation signal.



2.2 Decoder

The signal flow at the decoder is shown in Figure 3. First, the parameter indices are extracted from
the received bitstream. These indices are decoded to obtain the coder parameters corresponding
to a 10 ms speech frame. These parameters are the LSP coefficients, the 2 fractional pitch delays,
the 2 fixed codebook vectors, and the 2 sets of adaptive and fixed codebook gains. The LSP
coefficients are interpolated and converted to LP filter coefficients for each subframe. Then, for
each 40-sample subframe the following steps are done:

• the excitation is constructed by adding the adaptive and fixed codebook vectors scaled by
their respective gains,

• the speech is reconstructed by filtering the excitation through the LP synthesis filter,

• the reconstructed speech signal is passed through a post-processing stage, which comprises
an adaptive postfilter based on the long-term and short-term synthesis filters, followed by
a high-pass filter and scaling operation.

Figure 3: Signal flow at the CS-ACELP decoder.
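
The first two decoder steps can be illustrated with a short floating-point sketch. It is not the bit-exact fixed-point procedure of the Recommendation; the function name `decode_subframe`, the argument layout and the filter-memory convention are assumptions made for this example, and the post-processing stage is left out.

```python
def decode_subframe(v, c, g_p, g_c, a_hat, mem):
    """Rebuild one subframe from decoded parameters (illustrative sketch).

    v, c     : adaptive and fixed codebook vectors
    g_p, g_c : adaptive and fixed codebook gains
    a_hat    : quantized LP coefficients a_1..a_10
    mem      : previous output samples, most recent first (assumed layout)
    """
    # Excitation: sum of the two codebook vectors scaled by their gains.
    u = [g_p * vn + g_c * cn for vn, cn in zip(v, c)]
    mem = list(mem)
    speech = []
    for un in u:
        # LP synthesis filter 1/A(z): s(n) = u(n) - sum_i a_i * s(n - i)
        sn = un - sum(ai * si for ai, si in zip(a_hat, mem))
        speech.append(sn)
        mem = [sn] + mem[:-1]
    return u, speech, mem
```

Carrying `mem` from one call to the next keeps the synthesis filter state continuous across subframes.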
2.3 Delay

This coder encodes speech and other audio signals with 10 ms frames. In addition, there is a
look-ahead of 5 ms, resulting in a total algorithmic delay of 15 ms. All additional delays in a
practical implementation of this coder are due to:

• processing time needed for encoding and decoding operations,

• transmission time on the communication link,

• multiplexing delay when combining audio data with other data.
2.4 Speech coder description

The description of the speech coding algorithm of this Recommendation is made in terms of
bit-exact, fixed-point mathematical operations. The ANSI C code indicated in Section 5, which
constitutes an integral part of this Recommendation, reflects this bit-exact, fixed-point descriptive
approach. The mathematical descriptions of the encoder (Section 3), and decoder (Section 4), can
be implemented in several other fashions, possibly leading to a codec implementation not complying
with this Recommendation. Therefore, the algorithm description of the C code of Section 5 shall
take precedence over the mathematical descriptions of Sections 3 and 4 whenever discrepancies are
found. A non-exhaustive set of test sequences which can be used in conjunction with the C code
is available from the ITU.



2.5 Notational conventions 

Throughout this document the following notational conventions are maintained as far as possible.

• Codebooks are denoted by calligraphic characters (e.g. $\mathcal{C}$).

• Time signals are denoted by the symbol and the sample time index between parentheses (e.g.
$s(n)$). The symbol $n$ is used as sample instant index.

• Superscript time indices (e.g. $g^{(m)}$) refer to that variable corresponding to subframe $m$.

• Subscripts identify a particular element in a coefficient array.

• A ^ identifies a quantized version of a parameter.

• Range notations are done using square brackets, where the boundaries are included (e.g.
[0.6, 0.9]).

• log denotes a logarithm with base 10.

Table 2 lists the most relevant symbols used throughout this document. A glossary of the most



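Table 2: Glossary of symbols.

Name          Reference          Description
1/Â(z)        Eq. (2)            LP synthesis filter
H_h1(z)       Eq. (1)            input high-pass filter
[illegible]   Eq. (77)           pitch postfilter
[illegible]   Eq. [illegible]    short-term postfilter
[illegible]   Eq. [illegible]    tilt-compensation filter
[illegible]   Eq. (90)           output high-pass filter
P(z)          Eq. (46)           pitch filter
W(z)          Eq. (27)           weighting filter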



relevant signals is given in Table 3. Table 4 summarizes relevant variables and their dimension.
Constant parameters are listed in Table 5. The acronyms used in this Recommendation are
summarized in Table 6.



Table 3: Glossary of signals.

Name      Description
h(n)      impulse response of weighting and synthesis filters
r(k)      auto-correlation sequence
r'(k)     modified auto-correlation sequence
sw(n)     weighted speech signal
s(n)      speech signal
s'(n)     windowed speech signal
sf'(n)    gain-scaled postfiltered output
ŝ(n)      reconstructed speech signal
r(n)      residual signal
x(n)      target signal
x2(n)     second target signal
v(n)      adaptive codebook contribution
c(n)      fixed codebook contribution
y(n)      v(n) * h(n)
z(n)      c(n) * h(n)
u(n)      excitation to LP synthesis filter
d(n)      correlation between target signal and h(n)
ew(n)     error signal
Table 4: Glossary of variables.

Name     Size   Description
g_p      1      adaptive codebook gain
g_c      1      fixed codebook gain
g_0      1      modified gain for pitch postfilter
g_pit    1      pitch gain for pitch postfilter
g_f      1      gain term short-term postfilter
g_t      1      gain term tilt postfilter
T_op     1      open-loop pitch delay
a_i      10     LP coefficients
k_i      10     reflection coefficients
o_i      2      LAR coefficients
ω_i      10     LSF normalized frequencies
q_i      10     LSP coefficients
r(k)     11     correlation coefficients
w_i      10     LSP weighting coefficients
l_i      10     LSP quantizer output
Table 5: Glossary of constants.

Name          Value             Description
f_s           8000              sampling frequency
f_0           60                bandwidth expansion
γ_1           0.94/0.98         weight factor perceptual weighting filter
γ_2           0.60/[0.4-0.7]    weight factor perceptual weighting filter
[illegible]   0.55              weight factor post filter
[illegible]   0.70              weight factor post filter
[illegible]   0.50              weight factor pitch post filter
γ_t           0.90/0.2          weight factor tilt post filter
C             Table 7           fixed (algebraic) codebook
L0            Section 3.2.4     moving average predictor codebook
L1            Section 3.2.4     First stage LSP codebook
L2            Section 3.2.4     Second stage LSP codebook (low part)
L3            Section 3.2.4     Second stage LSP codebook (high part)
GA            Section 3.9       First stage gain codebook
GB            Section 3.9       Second stage gain codebook
w_lag         Eq. (6)           correlation lag window
w_lp          Eq. (3)           LPC analysis window

Table 6: Glossary of acronyms.

Acronym   Description
CELP      code-excited linear-prediction
MA        moving average
MSB       most significant bit
LP        linear prediction
LSP       line spectral pair
LSF       line spectral frequency
VQ        vector quantization
3 Functional description of the encoder

In this section we describe the different functions of the encoder represented in the blocks of
Figure 1.



3.1 Pre-processing 

As stated in Section 2, the input to the speech encoder is assumed to be a 16 bit PCM signal.
Two pre-processing functions are applied before the encoding process: 1) signal scaling, and 2)
high-pass filtering.

The scaling consists of dividing the input by a factor 2 to reduce the possibility of overflows
in the fixed-point implementation. The high-pass filter serves as a precaution against undesired
low-frequency components. A second order pole/zero filter with a cutoff frequency of 140 Hz is
used. Both the scaling and high-pass filtering are combined by dividing the coefficients at the
numerator of this filter by 2. The resulting filter is given by

$$ H_{h1}(z) = \frac{0.46363718 - 0.92724705\,z^{-1} + 0.46363718\,z^{-2}}{1 - 1.9059465\,z^{-1} + 0.9114024\,z^{-2}}. \qquad (1) $$

The input signal filtered through $H_{h1}(z)$ is referred to as $s(n)$, and will be used in all subsequent
coder operations.
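
As an illustration of the combined scaling and high-pass filter, the following floating-point sketch applies the second order pole/zero filter in direct form. The function name `preprocess` and the zero initial state are assumptions for this example; the Recommendation itself specifies the operation in fixed-point arithmetic.

```python
# Numerator coefficients already include the division by 2 (scaling).
B = (0.46363718, -0.92724705, 0.46363718)
# Denominator coefficients of the second order high-pass filter.
A = (1.0, -1.9059465, 0.9114024)

def preprocess(x):
    """Scale and high-pass filter the input samples x (direct form I)."""
    x1 = x2 = y1 = y2 = 0.0   # filter state, assumed to start at zero
    y = []
    for xn in x:
        yn = B[0] * xn + B[1] * x1 + B[2] * x2 - A[1] * y1 - A[2] * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y
```

A constant (DC) input is strongly attenuated once the start-up transient has decayed, while the first output sample is simply the input scaled by the leading numerator coefficient.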



3.2 Linear prediction analysis and quantization 

The short-term analysis and synthesis filters are based on 10th order linear prediction (LP) filters.
The LP synthesis filter is defined as

$$ \frac{1}{\hat{A}(z)} = \frac{1}{1 + \sum_{i=1}^{10} \hat{a}_i z^{-i}}, \qquad (2) $$

where $\hat{a}_i$, $i = 1,\ldots,10$, are the (quantized) linear prediction (LP) coefficients. Short-term
prediction, or linear prediction analysis, is performed once per speech frame using the autocorrelation
approach with a 30 ms asymmetric window. Every 80 samples (10 ms), the autocorrelation
coefficients of windowed speech are computed and converted to the LP coefficients using the Levinson
algorithm. Then the LP coefficients are transformed to the LSP domain for quantization and
interpolation purposes. The interpolated quantized and unquantized filters are converted back to
the LP filter coefficients (to construct the synthesis and weighting filters at each subframe).
3.2.1 Windowing and autocorrelation computation

The LP analysis window consists of two parts: the first part is half a Hamming window and the
second part is a quarter of a cosine function cycle. The window is given by:

$$ w_{lp}(n) = \begin{cases} 0.54 - 0.46\cos\left(\dfrac{2\pi n}{399}\right), & n = 0,\ldots,199, \\ \cos\left(\dfrac{2\pi (n - 200)}{159}\right), & n = 200,\ldots,239. \end{cases} \qquad (3) $$

There is a 5 ms lookahead in the LP analysis which means that 40 samples are needed from the
future speech frame. This translates into an extra delay of 5 ms at the encoder stage. The LP
analysis window applies to 120 samples from past speech frames, 80 samples from the present
speech frame, and 40 samples from the future frame. The windowing in LP analysis is illustrated
in Figure 4.



[Figure 4 labels: LP WINDOWS, SUBFRAMES]

Figure 4: Windowing in LP analysis. The different shading patterns identify corresponding
excitation and LP analysis frames.



The autocorrelation coefficients of the windowed speech

$$ s'(n) = w_{lp}(n)\, s(n), \qquad n = 0,\ldots,239, \qquad (4) $$

are computed by

$$ r(k) = \sum_{n=k}^{239} s'(n)\, s'(n-k), \qquad k = 0,\ldots,10. \qquad (5) $$

To avoid arithmetic problems for low-level input signals the value of $r(0)$ has a lower boundary of
$r(0) = 1.0$. A 60 Hz bandwidth expansion is applied, by multiplying the autocorrelation coefficients
with

$$ w_{lag}(k) = \exp\left[-\frac{1}{2}\left(\frac{2\pi f_0 k}{f_s}\right)^2\right], \qquad k = 1,\ldots,10, \qquad (6) $$

where $f_0 = 60$ Hz is the bandwidth expansion and $f_s = 8000$ Hz is the sampling frequency. Further,
$r(0)$ is multiplied by the white noise correction factor 1.0001, which is equivalent to adding a noise
floor at -40 dB.
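
The windowing, autocorrelation and lag-window steps can be put together in a small floating-point sketch. The helper names `lp_window` and `modified_autocorr` are assumptions for this example, and the 240-sample input buffer is assumed to hold 120 past, 80 current and 40 look-ahead samples in that order.

```python
import math

def lp_window():
    """Asymmetric 240-sample LP analysis window: half a Hamming window
    followed by a quarter cycle of a cosine."""
    w = [0.54 - 0.46 * math.cos(2.0 * math.pi * n / 399.0) for n in range(200)]
    w += [math.cos(2.0 * math.pi * (n - 200) / 159.0) for n in range(200, 240)]
    return w

def modified_autocorr(s, order=10, f0=60.0, fs=8000.0):
    """Return the lag-windowed autocorrelation r'(0..order) of 240 samples."""
    sp = [wn * sn for wn, sn in zip(lp_window(), s)]          # windowed speech
    r = [sum(sp[n] * sp[n - k] for n in range(k, 240)) for k in range(order + 1)]
    r[0] = max(r[0], 1.0)                                     # lower bound on r(0)
    rp = [1.0001 * r[0]]                                      # white noise correction
    for k in range(1, order + 1):
        w_lag = math.exp(-0.5 * (2.0 * math.pi * f0 * k / fs) ** 2)
        rp.append(w_lag * r[k])                               # 60 Hz bandwidth expansion
    return rp
```

The output of `modified_autocorr` is the set of coefficients that the Levinson-Durbin recursion takes as input.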
3.2.2 Levinson-Durbin algorithm

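The modified autocorrelation coefficients

$$ r'(0) = 1.0001\, r(0) $$
$$ r'(k) = w_{lag}(k)\, r(k), \qquad k = 1,\ldots,10 \qquad (7) $$

are used to obtain the LP filter coefficients $a_i$, $i = 1,\ldots,10$, by solving the set of equations

$$ \sum_{i=1}^{10} a_i\, r'(|i - k|) = -r'(k), \qquad k = 1,\ldots,10. \qquad (8) $$

The set of equations in (8) is solved using the Levinson-Durbin algorithm. This algorithm uses the
following recursion:

E(0) = r'(0)
for i = 1 to 10
    a_0^(i-1) = 1
    k_i = -[ sum_{j=0}^{i-1} a_j^(i-1) r'(i - j) ] / E(i - 1)
    a_i^(i) = k_i
    for j = 1 to i - 1
        a_j^(i) = a_j^(i-1) + k_i a_{i-j}^(i-1)
    end
    E(i) = (1 - k_i^2) E(i - 1),   if E(i) < 0 then E(i) = 0.01
end

The final solution is given as $a_j = a_j^{(10)}$, $j = 1,\ldots,10$.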
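
A compact floating-point version of this recursion is sketched below. The function name `levinson_durbin` and the decision to return the reflection coefficients alongside the LP coefficients are assumptions for this example.

```python
def levinson_durbin(rp, order=10):
    """Solve sum_i a_i r'(|i-k|) = -r'(k) for a_1..a_order.

    rp : modified autocorrelation coefficients r'(0..order)
    Returns (a, k) with a = [a_1..a_order] and k the reflection coefficients.
    """
    a = [1.0] + [0.0] * order          # a[0] is the implicit leading 1
    refl = []
    err = rp[0]                        # E(0)
    for i in range(1, order + 1):
        acc = sum(a[j] * rp[i - j] for j in range(i))
        k = -acc / err
        refl.append(k)
        prev = a[:]                    # coefficients of order i - 1
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        err = (1.0 - k * k) * err      # E(i)
        if err < 0.0:
            err = 0.01
    return a[1:], refl
```

For the autocorrelation of a first order process, for example `[1.0, 0.5, 0.25]` with `order=2`, only the first coefficient is non-zero.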
3.2.3 LP to LSP conversion

The LP filter coefficients $a_i$, $i = 1,\ldots,10$ are converted to the line spectral pair (LSP)
representation for quantization and interpolation purposes. For a 10th order LP filter, the LSP coefficients
are defined as the roots of the sum and difference polynomials

$$ F_1'(z) = A(z) + z^{-11} A(z^{-1}), \qquad (9) $$

and

$$ F_2'(z) = A(z) - z^{-11} A(z^{-1}), \qquad (10) $$

respectively. The polynomial $F_1'(z)$ is symmetric, and $F_2'(z)$ is antisymmetric. It can be proven
that all roots of these polynomials are on the unit circle and they alternate each other. $F_1'(z)$ has
a root $z = -1$ ($\omega = \pi$) and $F_2'(z)$ a root $z = 1$ ($\omega = 0$). To eliminate these two roots, we define
the new polynomials

$$ F_1(z) = F_1'(z)/(1 + z^{-1}), \qquad (11) $$

and

$$ F_2(z) = F_2'(z)/(1 - z^{-1}). \qquad (12) $$

Each polynomial has 5 conjugate roots on the unit circle ($e^{\pm j\omega_i}$), therefore, the polynomials can
be written as

$$ F_1(z) = \prod_{i=1,3,\ldots,9} (1 - 2 q_i z^{-1} + z^{-2}) \qquad (13) $$

and

$$ F_2(z) = \prod_{i=2,4,\ldots,10} (1 - 2 q_i z^{-1} + z^{-2}), \qquad (14) $$

where $q_i = \cos(\omega_i)$ with $\omega_i$ being the line spectral frequencies (LSF) and they satisfy the ordering
property $0 < \omega_1 < \omega_2 < \ldots < \omega_{10} < \pi$. We refer to $q_i$ as the LSP coefficients in the cosine domain.

Since both polynomials $F_1(z)$ and $F_2(z)$ are symmetric only the first 5 coefficients of each
polynomial need to be computed. The coefficients of these polynomials are found by the recursive
relations

$$ f_1(i+1) = a_{i+1} + a_{10-i} - f_1(i), \qquad i = 0,\ldots,4, $$
$$ f_2(i+1) = a_{i+1} - a_{10-i} + f_2(i), \qquad i = 0,\ldots,4, \qquad (15) $$

where $f_1(0) = f_2(0) = 1.0$. The LSP coefficients are found by evaluating the polynomials $F_1(z)$
and $F_2(z)$ at 60 points equally spaced between 0 and $\pi$ and checking for sign changes. A sign
change signifies the existence of a root and the sign change interval is then divided 4 times to
better track the root. The Chebyshev polynomials are used to evaluate $F_1(z)$ and $F_2(z)$. In this
method the roots are found directly in the cosine domain $\{q_i\}$. The polynomials $F_1(z)$ or $F_2(z)$,
evaluated at $z = e^{j\omega}$, can be written as

$$ F(\omega) = 2 e^{-j5\omega}\, C(x), \qquad (16) $$

with

$$ C(x) = T_5(x) + f(1) T_4(x) + f(2) T_3(x) + f(3) T_2(x) + f(4) T_1(x) + f(5)/2, \qquad (17) $$

where $T_m(x) = \cos(m\omega)$ is the mth order Chebyshev polynomial, and $f(i)$, $i = 1,\ldots,5$, are the
coefficients of either $F_1(z)$ or $F_2(z)$, computed using the equations in (15). The polynomial $C(x)$
is evaluated at a certain value of $x = \cos(\omega)$ using the recursive relation:

for k = 4 downto 1
    b_k = 2x b_{k+1} - b_{k+2} + f(5 - k)
end
C(x) = x b_1 - b_2 + f(5)/2

with initial values $b_5 = 1$ and $b_6 = 0$.
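
The conversion can be sketched in floating point as follows. The function names are assumptions for this example, and two simplifications are made relative to the text: each polynomial is searched independently over the whole grid (rather than alternating between the two), and the midpoint of the final interval is returned as the root estimate.

```python
import math

def lsp_polynomials(a):
    """Coefficients f1(0..5), f2(0..5) from LP coefficients a = [a_1..a_10]."""
    f1, f2 = [1.0], [1.0]
    for i in range(5):
        f1.append(a[i] + a[9 - i] - f1[i])
        f2.append(a[i] - a[9 - i] + f2[i])
    return f1, f2

def cheb_eval(x, f):
    """Evaluate C(x) with the recursion b_k = 2x b_{k+1} - b_{k+2} + f(5-k)."""
    b1, b2 = 1.0, 0.0                      # b_5 and b_6
    for k in range(4, 0, -1):
        b1, b2 = 2.0 * x * b1 - b2 + f[5 - k], b1
    return x * b1 - b2 + f[5] / 2.0

def lp_to_lsp(a, grid=60, bisections=4):
    """Return the LSP coefficients q_1..q_10 (cosine domain, q_1 largest)."""
    roots = []
    for f in lsp_polynomials(a):
        x_hi, y_hi = 1.0, cheb_eval(1.0, f)
        for j in range(1, grid + 1):
            x_lo = math.cos(math.pi * j / grid)
            y_lo = cheb_eval(x_lo, f)
            if y_lo * y_hi < 0.0:          # sign change: a root lies in between
                lo, hi, ylo = x_lo, x_hi, y_lo
                for _ in range(bisections):
                    mid = 0.5 * (lo + hi)
                    ymid = cheb_eval(mid, f)
                    if ylo * ymid <= 0.0:
                        hi = mid
                    else:
                        lo, ylo = mid, ymid
                roots.append(0.5 * (lo + hi))
            x_hi, y_hi = x_lo, y_lo
    return sorted(roots, reverse=True)
```

For an all-zero coefficient set ($A(z) = 1$) the roots fall at $\omega_i = i\pi/11$, so the search returns ten evenly spaced line spectral frequencies.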

3.2.4 Quantization of the LSP coefficients

The LP filter coefficients are quantized using the LSP representation in the frequency domain; that
is

$$ \omega_i = \arccos(q_i), \qquad i = 1,\ldots,10, \qquad (18) $$

where $\omega_i$ are the line spectral frequencies (LSF) in the normalized frequency domain $[0, \pi]$. A
switched 4th order MA prediction is used to predict the current set of LSF coefficients. The
difference between the computed and predicted set of coefficients is quantized using a two-stage
vector quantizer. The first stage is a 10-dimensional VQ using codebook $\mathcal{L}1$ with 128 entries (7
bits). The second stage is a 10 bit VQ which has been implemented as a split VQ using two
5-dimensional codebooks, $\mathcal{L}2$ and $\mathcal{L}3$, containing 32 entries (5 bits) each.

To explain the quantization process, it is convenient to first describe the decoding process.
Each coefficient is obtained from the sum of 2 codebooks:

$$ l_i = \begin{cases} \mathcal{L}1_i(L1) + \mathcal{L}2_i(L2), & i = 1,\ldots,5, \\ \mathcal{L}1_i(L1) + \mathcal{L}3_{i-5}(L3), & i = 6,\ldots,10, \end{cases} \qquad (19) $$

where L1, L2, and L3 are the codebook indices. To avoid sharp resonances in the quantized LP
synthesis filters, the coefficients $l_i$ are arranged such that adjacent coefficients have a minimum
distance of $J$. The rearrangement routine is shown below:

for i = 2, ..., 10
    if (l_{i-1} > l_i - J)
        l_{i-1} = (l_i + l_{i-1} - J)/2
        l_i = (l_i + l_{i-1} + J)/2
    end
end

This rearrangement process is executed twice: first with a value of $J = 0.0001$, then with a value
of $J = 0.000095$.
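
The rearrangement routine can be sketched as below. The function names are assumptions for this example, and both new values of a pair are computed from the old values at once, which is an interpretation of the two assignments in the routine (it places the pair symmetrically around its mean, exactly $J$ apart).

```python
def rearrange(l, J):
    """Push adjacent coefficients apart so that l[i] - l[i-1] >= J."""
    l = list(l)
    for i in range(1, len(l)):
        if l[i - 1] > l[i] - J:
            # Both values are derived from the old pair (assumed reading).
            lower = (l[i] + l[i - 1] - J) / 2.0
            upper = (l[i] + l[i - 1] + J) / 2.0
            l[i - 1], l[i] = lower, upper
    return l

def enforce_spacing(l):
    """Run the routine twice, with J = 0.0001 and then J = 0.000095."""
    return rearrange(rearrange(l, 0.0001), 0.000095)
```

Coefficients that are already far enough apart pass through unchanged.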

After this rearrangement process, the quantized LSF coefficients $\hat{\omega}_i^{(m)}$ for the current frame $m$
are obtained from the weighted sum of previous quantizer outputs $l^{(m-k)}$ and the current quantizer
output $l^{(m)}$

$$ \hat{\omega}_i^{(m)} = \left(1 - \sum_{k=1}^{4} m_i^k\right) l_i^{(m)} + \sum_{k=1}^{4} m_i^k\, l_i^{(m-k)}, \qquad i = 1,\ldots,10, \qquad (20) $$

where $m_i^k$ are the coefficients of the switched MA predictor. Which MA predictor to use is defined
by a separate bit L0. At startup the initial values of $l_i^{(k)}$ are given by $l_i = i\pi/11$ for all $k < 0$.

After computing $\hat{\omega}_i$, the corresponding filter is checked for stability. This is done as follows:

1. Order the coefficients $\hat{\omega}_i$ in increasing value,

2. If $\hat{\omega}_1 < 0.005$ then $\hat{\omega}_1 = 0.005$,

3. If $\hat{\omega}_{i+1} - \hat{\omega}_i < 0.0001$ then $\hat{\omega}_{i+1} = \hat{\omega}_i + 0.0001$, $i = 1,\ldots,9$,

4. If $\hat{\omega}_{10} > 3.135$ then $\hat{\omega}_{10} = 3.135$.
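
The four stability steps translate directly into a few lines of code. This is a floating-point sketch; the function name `stabilize` is an assumption for this example, and it accepts a list of any length although the coder always uses 10 coefficients.

```python
def stabilize(w):
    """Apply the four stability steps to a list of quantized LSFs."""
    w = sorted(w)                          # step 1: increasing order
    if w[0] < 0.005:                       # step 2: lower limit
        w[0] = 0.005
    for i in range(len(w) - 1):            # step 3: minimum spacing
        if w[i + 1] - w[i] < 0.0001:
            w[i + 1] = w[i] + 0.0001
    if w[-1] > 3.135:                      # step 4: upper limit
        w[-1] = 3.135
    return w
```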

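The procedure for encoding the LSF parameters can be outlined as follows. For each of the
two MA predictors the best approximation to the current LSF vector has to be found. The best
approximation is defined as the one that minimizes a weighted mean-squared error

$$ E = \sum_{i=1}^{10} w_i (\omega_i - \hat{\omega}_i)^2. \qquad (21) $$

The weights $w_i$ are made adaptive as a function of the unquantized LSF coefficients,

$$ w_1 = \begin{cases} 1.0 & \text{if } \omega_2 - 0.04\pi - 1 > 0, \\ 10(\omega_2 - 0.04\pi - 1)^2 + 1 & \text{otherwise} \end{cases} $$
$$ w_i\ (2 \le i \le 9) = \begin{cases} 1.0 & \text{if } \omega_{i+1} - \omega_{i-1} - 1 > 0, \\ 10(\omega_{i+1} - \omega_{i-1} - 1)^2 + 1 & \text{otherwise} \end{cases} $$
$$ w_{10} = \begin{cases} 1.0 & \text{if } -\omega_9 + 0.92\pi - 1 > 0, \\ 10(-\omega_9 + 0.92\pi - 1)^2 + 1 & \text{otherwise} \end{cases} \qquad (22) $$

In addition, the weights $w_5$ and $w_6$ are multiplied by 1.2 each.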



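The vector to be quantized for the current frame is obtained from

$$ l_i' = \frac{\omega_i^{(m)} - \sum_{k=1}^{4} m_i^k\, l_i^{(m-k)}}{1 - \sum_{k=1}^{4} m_i^k}, \qquad i = 1,\ldots,10. \qquad (23) $$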

The first codebook $\mathcal{L}1$ is searched and the entry L1 that minimizes the (unweighted)
mean-squared error is selected. This is followed by a search of the second codebook $\mathcal{L}2$, which defines
the lower part of the second stage. For each possible candidate, the partial vector $\hat{\omega}_i$, $i = 1,\ldots,5$
is reconstructed using Eq. (20), and rearranged to guarantee a minimum distance of 0.0001. The
vector with index L2 which, after addition to the first stage candidate and rearranging, approximates
the lower part of the corresponding target best in the weighted MSE sense is selected. Using the
selected first stage vector L1 and the lower part of the second stage (L2), the higher part of
the second stage is searched from codebook $\mathcal{L}3$. Again the rearrangement procedure is used to
guarantee a minimum distance of 0.0001. The vector L3 that minimizes the overall weighted MSE
is selected.

This process is done for each of the two MA predictors defined by $\mathcal{L}0$, and the MA predictor
L0 that produces the lowest weighted MSE is selected.

3.2.5 Interpolation of the LSP coefficients

The quantized (and unquantized) LP coefficients are used for the second subframe. For the first
subframe, the quantized (and unquantized) LP coefficients are obtained from linear interpolation
of the corresponding parameters in the adjacent subframes. The interpolation is done on the LSP
coefficients in the $q$ domain. Let $q_i^{(m)}$ be the LSP coefficients at the 2nd subframe of frame $m$, and
$q_i^{(m-1)}$ the LSP coefficients at the 2nd subframe of the past frame $(m - 1)$. The (unquantized)
interpolated LSP coefficients in each of the 2 subframes are given by

$$ \text{Subframe 1}: \quad q1_i = 0.5\, q_i^{(m-1)} + 0.5\, q_i^{(m)}, \qquad i = 1,\ldots,10, $$
$$ \text{Subframe 2}: \quad q2_i = q_i^{(m)}, \qquad i = 1,\ldots,10. \qquad (24) $$

The same interpolation procedure is used for the interpolation of the quantized LSP coefficients
by substituting $q_i$ by $\hat{q}_i$ in Eq. (24).

3.2.6 LSP to LP conversion 

Once the LSP coefficients are quantized and interpolated, they are converted back to LP coefficients
$\{a_i\}$. The conversion to the LP domain is done as follows. The coefficients of $F_1(z)$ and $F_2(z)$ are
found by expanding Eqs. (13) and (14) knowing the quantized and interpolated LSP coefficients.
The following recursive relation is used to compute $f_1(i)$, $i = 1,\ldots,5$, from $q_i$

for i = 1 to 5
    f1(i) = -2 q_{2i-1} f1(i - 1) + 2 f1(i - 2)
    for j = i - 1 downto 1
        f1(j) = f1(j) - 2 q_{2i-1} f1(j - 1) + f1(j - 2)
    end
end

with initial values $f_1(0) = 1$ and $f_1(-1) = 0$. The coefficients $f_2(i)$ are computed similarly by
replacing $q_{2i-1}$ by $q_{2i}$.

Once the coefficients $f_1(i)$ and $f_2(i)$ are found, $F_1(z)$ and $F_2(z)$ are multiplied by $1 + z^{-1}$ and
$1 - z^{-1}$ respectively, to obtain $F_1'(z)$ and $F_2'(z)$; that is

$$ f_1'(i) = f_1(i) + f_1(i-1), \qquad i = 1,\ldots,5, $$
$$ f_2'(i) = f_2(i) - f_2(i-1), \qquad i = 1,\ldots,5. \qquad (25) $$

Finally the LP coefficients are found by

$$ a_i = \begin{cases} 0.5 f_1'(i) + 0.5 f_2'(i), & i = 1,\ldots,5, \\ 0.5 f_1'(11 - i) - 0.5 f_2'(11 - i), & i = 6,\ldots,10. \end{cases} \qquad (26) $$

This is directly derived from the relation $A(z) = (F_1'(z) + F_2'(z))/2$, and because $F_1'(z)$ and $F_2'(z)$
are symmetric and antisymmetric polynomials, respectively.
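
The conversion back to LP coefficients can be sketched in floating point as follows. The function name `lsp_to_lp` and the helper `expand` are assumptions for this example. The upper half of the coefficients is taken from the mirrored index $11 - i$, which is what the symmetry of $F_1'(z)$ and the antisymmetry of $F_2'(z)$ imply.

```python
import math

def lsp_to_lp(q):
    """Convert LSP coefficients q = [q_1..q_10] (cosine domain) to a_1..a_10."""
    def expand(qs):
        # Expand the product of five factors (1 - 2 q z^-1 + z^-2); f[0..5].
        f = [1.0] + [0.0] * 5
        for i in range(1, 6):
            qi = qs[i - 1]
            f[i] = -2.0 * qi * f[i - 1] + 2.0 * (f[i - 2] if i >= 2 else 0.0)
            for j in range(i - 1, 0, -1):
                f[j] = f[j] - 2.0 * qi * f[j - 1] + (f[j - 2] if j >= 2 else 0.0)
        return f

    f1 = expand(q[0::2])                   # q_1, q_3, ..., q_9
    f2 = expand(q[1::2])                   # q_2, q_4, ..., q_10
    # Multiply by (1 + z^-1) and (1 - z^-1); index 0 is unused padding.
    f1p = [0.0] + [f1[i] + f1[i - 1] for i in range(1, 6)]
    f2p = [0.0] + [f2[i] - f2[i - 1] for i in range(1, 6)]
    a = [0.5 * f1p[i] + 0.5 * f2p[i] for i in range(1, 6)]
    a += [0.5 * f1p[11 - i] - 0.5 * f2p[11 - i] for i in range(6, 11)]
    return a
```

With the start-up values $\omega_i = i\pi/11$ the routine returns an all-zero coefficient set, that is $A(z) = 1$.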



3.3 Perceptual weighting

The perceptual weighting filter is based on the unquantized LP filter coefficients and is given by

$$ W(z) = \frac{A(z/\gamma_1)}{A(z/\gamma_2)} = \frac{1 + \sum_{i=1}^{10} \gamma_1^i a_i z^{-i}}{1 + \sum_{i=1}^{10} \gamma_2^i a_i z^{-i}}. \qquad (27) $$

The values of $\gamma_1$ and $\gamma_2$ determine the frequency response of the filter $W(z)$. By proper adjustment
of these variables it is possible to make the weighting more effective. This is accomplished by
making $\gamma_1$ and $\gamma_2$ a function of the spectral shape of the input signal. This adaptation is done
once per 10 ms frame, but an interpolation procedure for each first subframe is used to smooth
this adaptation process. The spectral shape is obtained from a 2nd-order linear prediction filter,
obtained as a by-product from the Levinson-Durbin recursion (Section 3.2.2). The reflection
coefficients $k_i$ are converted to Log Area Ratio (LAR) coefficients $o_i$ by

$$ o_i = \log\frac{1.0 + k_i}{1.0 - k_i}, \qquad i = 1, 2. \qquad (28) $$

These LAR coefficients are used for the second subframe. The LAR coefficients for the first
subframe are obtained through linear interpolation with the LAR parameters from the previous
frame, and are given by:

$$ \text{Subframe 1}: \quad o1_i = 0.5\, o_i^{(m-1)} + 0.5\, o_i^{(m)}, \qquad i = 1, 2, $$
$$ \text{Subframe 2}: \quad o2_i = o_i^{(m)}, \qquad i = 1, 2. \qquad (29) $$

The spectral envelope is characterized as being either flat (flat = 1) or tilted (flat = 0). For each
subframe this characterization is obtained by applying a threshold function to the LAR coefficients.
To avoid rapid changes, a hysteresis is used by taking into account the value of flat in the previous
subframe $(m - 1)$,

$$ flat^{(m)} = \begin{cases} 0 & \text{if } o_1 < -1.74 \text{ and } o_2 > 0.65 \text{ and } flat^{(m-1)} = 1, \\ 1 & \text{if } o_1 > -1.52 \text{ and } o_2 < 0.43 \text{ and } flat^{(m-1)} = 0, \\ flat^{(m-1)} & \text{otherwise.} \end{cases} \qquad (30) $$



If the interpolated spectrum for a subframe is classified as flat ($flat^{(m)} = 1$), the weight factors
are set to $\gamma_1 = 0.94$ and $\gamma_2 = 0.6$. If the spectrum is classified as tilted ($flat^{(m)} = 0$), the value
of $\gamma_1$ is set to 0.98, and the value of $\gamma_2$ is adapted to the strength of the resonances in the LP
synthesis filter, but is bounded between 0.4 and 0.7. If a strong resonance is present, the value
of $\gamma_2$ is set closer to the upper bound. This adaptation is achieved by a criterion based on the
minimum distance between 2 successive LSP coefficients for the current subframe. The minimum
distance is given by

$$ d_{min} = \min[\omega_{i+1} - \omega_i], \qquad i = 1,\ldots,9. \qquad (31) $$

The following linear relation is used to compute $\gamma_2$:

$$ \gamma_2 = -6.0 \cdot d_{min} + 1.0, \quad \text{and } 0.4 \le \gamma_2 \le 0.7. \qquad (32) $$

The weighted speech signal in a subframe is given by

$$ sw(n) = s(n) + \sum_{i=1}^{10} a_i \gamma_1^i\, s(n - i) - \sum_{i=1}^{10} a_i \gamma_2^i\, sw(n - i), \qquad n = 0,\ldots,39. \qquad (33) $$

The weighted speech signal $sw(n)$ is used to find an estimation of the pitch delay in the speech
frame.
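
The adaptation of the two weight factors for one subframe can be sketched as follows. The function name `weighting_factors` and its argument layout are assumptions for this example; the flat-spectrum value of 0.6 for $\gamma_2$ is the one listed in Table 5.

```python
def weighting_factors(o1, o2, flat_prev, lsf):
    """Return (flat, gamma1, gamma2) for one subframe.

    o1, o2    : interpolated LAR coefficients of the subframe
    flat_prev : flat/tilted decision of the previous subframe (1 or 0)
    lsf       : line spectral frequencies of the subframe, in increasing order
    """
    # Threshold decision with hysteresis on the previous decision.
    if o1 < -1.74 and o2 > 0.65 and flat_prev == 1:
        flat = 0
    elif o1 > -1.52 and o2 < 0.43 and flat_prev == 0:
        flat = 1
    else:
        flat = flat_prev
    if flat == 1:
        return flat, 0.94, 0.6
    # Tilted spectrum: gamma2 follows the minimum LSF distance, within [0.4, 0.7].
    d_min = min(lsf[i + 1] - lsf[i] for i in range(len(lsf) - 1))
    gamma2 = min(0.7, max(0.4, -6.0 * d_min + 1.0))
    return flat, 0.98, gamma2
```

Closely spaced line spectral frequencies indicate a strong resonance, so a small `d_min` drives `gamma2` towards the upper bound of 0.7.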



3.4 Open-loop pitch analysis 

To reduce the complexity of the search for the best adaptive codebook delay, the search range is
limited around a candidate delay $T_{op}$, obtained from an open-loop pitch analysis. This open-loop
pitch analysis is done once per frame (10 ms). The open-loop pitch estimation uses the weighted
speech signal $sw(n)$ of Eq. (33), and is done as follows: In the first step, 3 maxima of the correlation

$$ R(k) = \sum_{n=0}^{79} sw(n)\, sw(n - k) \qquad (34) $$

are found in the following three ranges

i = 1:  80, ..., 143,
i = 2:  40, ..., 79,
i = 3:  20, ..., 39.

The retained maxima $R(t_i)$, $i = 1,\ldots,3$, are normalized through

$$ R'(t_i) = \frac{R(t_i)}{\sqrt{\sum_{n} sw^2(n - t_i)}}, \qquad i = 1,\ldots,3. $$

The winner among the three normalized correlations is selected by favoring the delays with the
values in the lower range. This is done by weighting the normalized correlations corresponding to
the longer delays. The best open-loop delay $T_{op}$ is determined as follows:

T_op = t_1
R'(T_op) = R'(t_1)
if R'(t_2) >= 0.85 R'(T_op)
    R'(T_op) = R'(t_2)
    T_op = t_2
end
if R'(t_3) >= 0.85 R'(T_op)
    R'(T_op) = R'(t_3)
    T_op = t_3
end

This procedure of dividing the delay range into 3 sections and favoring the lower sections is
used to avoid choosing pitch multiples.
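
The three-range search with its preference for shorter delays can be sketched as follows. The function name `open_loop_pitch`, the buffer layout (143 past samples followed by the 80 samples of the current frame) and the small floor on the energy term are assumptions for this example; the normalization divides each retained maximum by the square root of the energy of the delayed signal.

```python
import math

def open_loop_pitch(buf):
    """Estimate the open-loop delay from the weighted speech.

    buf : 143 past samples followed by the 80 samples of the current frame,
          so that sw(n) = buf[143 + n].
    """
    base = 143

    def corr(k):
        return sum(buf[base + n] * buf[base + n - k] for n in range(80))

    def normalized(t):
        energy = sum(buf[base + n - t] ** 2 for n in range(80))
        return corr(t) / math.sqrt(max(energy, 1e-12))   # floor avoids 0-division

    t_op, r_op = None, None
    # Longest delays first; a shorter delay wins if it reaches 85% of the best.
    for lo, hi in ((80, 143), (40, 79), (20, 39)):
        t = max(range(lo, hi + 1), key=corr)
        r = normalized(t)
        if t_op is None or r >= 0.85 * r_op:
            t_op, r_op = t, r
    return t_op
```

For a signal with a period of 50 samples both 50 and 100 give the same normalized correlation, and the 0.85 weighting selects the shorter delay instead of the pitch multiple.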

3.5 Computation of the impulse response 

The impulse response, $h(n)$, of the weighted synthesis filter $W(z)/\hat{A}(z)$ is computed for each
subframe. This impulse response is needed for the search of adaptive and fixed codebooks. The
impulse response $h(n)$ is computed by filtering the vector of coefficients of the filter $A(z/\gamma_1)$
extended by zeros through the two filters $1/\hat{A}(z)$ and $1/A(z/\gamma_2)$.
3.6 Computation of the target signal

The target signal $x(n)$ for the adaptive codebook search is usually computed by subtracting the
zero-input response of the weighted synthesis filter $W(z)/\hat{A}(z) = A(z/\gamma_1)/[\hat{A}(z) A(z/\gamma_2)]$ from the
weighted speech signal $sw(n)$ of Eq. (33). This is done on a subframe basis.

An equivalent procedure for computing the target signal, which is used in this Recommendation,
is the filtering of the LP residual signal $r(n)$ through the combination of synthesis filter $1/\hat{A}(z)$
and the weighting filter $A(z/\gamma_1)/A(z/\gamma_2)$. After determining the excitation for the subframe, the
initial states of these filters are updated by filtering the difference between the LP residual and
excitation. The memory update of these filters is explained in Section 3.10.

The residual signal $r(n)$, which is needed for finding the target vector, is also used in the adaptive
codebook search to extend the past excitation buffer. This simplifies the adaptive codebook search
procedure for delays less than the subframe size of 40, as will be explained in the next section. The
LP residual is given by

$$ r(n) = s(n) + \sum_{i=1}^{10} \hat{a}_i\, s(n - i), \qquad n = 0,\ldots,39. \qquad (35) $$

3.7 Adaptive-codebook search 

The adaptive-codebook parameters (or pitch parameters) are the delay and gain. In the adaptive
codebook approach for implementing the pitch filter, the excitation is repeated for delays less than
the subframe length. In the search stage, the excitation is extended by the LP residual to simplify
the closed-loop search. The adaptive-codebook search is done every (5 ms) subframe. In the first
subframe, a fractional pitch delay $T_1$ is used with a resolution of 1/3 in the range [19 1/3, 84 2/3] and
integers only in the range [85, 143]. For the second subframe, a delay $T_2$ with a resolution of 1/3
is always used in the range [(int)$T_1$ - 5 2/3, (int)$T_1$ + 4 2/3], where (int)$T_1$ is the nearest integer to
the fractional pitch delay $T_1$ of the first subframe. This range is adapted for the cases where $T_1$
straddles the boundaries of the delay range.

For each subframe the optimal delay is determined using closed-loop analysis that minimizes
the weighted mean-squared error. In the first subframe the delay $T_1$ is found by searching a small
range (6 samples) of delay values around the open-loop delay $T_{op}$ (see Section 3.4). The search
boundaries $t_{min}$ and $t_{max}$ are defined by

t_min = T_op - 3
if t_min < 20 then t_min = 20
t_max = t_min + 6
if t_max > 143 then
    t_max = 143
    t_min = t_max - 6
end

For the second subframe, closed-loop pitch analysis is done around the pitch selected in the first
subframe to find the optimal delay $T_2$. The search boundaries are between $t_{min} - 2/3$ and $t_{max} + 2/3$,
where $t_{min}$ and $t_{max}$ are derived from $T_1$ as follows:

t_min = (int)T_1 - 5
if t_min < 20 then t_min = 20
t_max = t_min + 9
if t_max > 143 then
    t_max = 143
    t_min = t_max - 9
end

The closed-loop pitch search minimizes the mean-squared weighted error between the original
and synthesized speech. This is achieved by maximizing the term

$$ R(k) = \frac{\sum_{n=0}^{39} x(n)\, y_k(n)}{\sqrt{\sum_{n=0}^{39} y_k(n)\, y_k(n)}}, \qquad (37) $$

where $x(n)$ is the target signal and $y_k(n)$ is the past filtered excitation at delay $k$ (past excitation
convolved with $h(n)$). Note that the search range is limited around a preselected value, which is
the open-loop pitch $T_{op}$ for the first subframe, and $T_1$ for the second subframe.

The convolution $y_k(n)$ is computed for the delay $t_{min}$, and for the other integer delays in the
search range $k = t_{min} + 1,\ldots,t_{max}$, it is updated using the recursive relation

$$ y_k(n) = y_{k-1}(n - 1) + u(-k)\, h(n), \qquad n = 39,\ldots,0, \qquad (38) $$

where $u(n)$, $n = -143,\ldots,39$, is the excitation buffer, and $y_{k-1}(-1) = 0$. Note that in the search
stage, the samples $u(n)$, $n = 0,\ldots,39$ are not known, and they are needed for pitch delays less
than 40. To simplify the search, the LP residual is copied to $u(n)$ to make the relation in Eq. (38)
valid for all delays.

For the determination of $T_2$, and $T_1$ if the optimum integer closed-loop delay is less than 84,
the fractions around the optimum integer delay have to be tested. The fractional pitch search
is done by interpolating the normalized correlation in Eq. (37) and searching for its maximum.
The interpolation is done using a FIR filter $b_{12}$ based on a Hamming windowed sinc function with
the sinc truncated at ±11 and padded with zeros at ±12 ($b_{12}(12) = 0$). The filter has its cut-off
frequency (-3 dB) at 3600 Hz in the oversampled domain. The interpolated values of $R(k)$ for the
fractions -2/3, -1/3, 0, 1/3, and 2/3 are obtained using the interpolation formula

$$ R(k)_t = \sum_{i=0}^{3} R(k - i)\, b_{12}(t + i \cdot 3) + \sum_{i=0}^{3} R(k + 1 + i)\, b_{12}(3 - t + i \cdot 3), \qquad t = 0, 1, 2, \qquad (39) $$

where $t = 0, 1, 2$ corresponds to the fractions 0, 1/3, and 2/3, respectively. Note that it is necessary
to compute correlation terms in Eq. (37) using a range $t_{min} - 4$, $t_{max} + 4$, to allow for the proper
interpolation.



3.7.1 Generation of the adaptive codebook vector



Once the noninteger pitch delay has been determined, the adaptive codebook vector $v(n)$ is
computed by interpolating the past excitation signal $u(n)$ at the given integer delay $k$ and fraction $t$

$$ v(n) = \sum_{i=0}^{9} u(n - k + i)\, b_{30}(t + i \cdot 3) + \sum_{i=0}^{9} u(n - k + 1 + i)\, b_{30}(3 - t + i \cdot 3), \qquad n = 0,\ldots,39, \quad t = 0, 1, 2. \qquad (40) $$

The interpolation filter $b_{30}$ is based on a Hamming windowed sinc function with the sinc truncated
at ±29 and padded with zeros at ±30 ($b_{30}(30) = 0$). The filter has a cut-off frequency (-3 dB) at
3600 Hz in the oversampled domain.

3.7.2 Codeword computation for adaptive codebook delays

The pitch delay $T_1$ is encoded with 8 bits in the first subframe and the relative delay in the second
subframe is encoded with 5 bits. A fractional delay $T$ is represented by its integer part (int)$T$,
and a fractional part frac/3, frac = -1, 0, 1. The pitch index P1 is now encoded as

$$ P1 = \begin{cases} ((int)T_1 - 19) \cdot 3 + frac - 1, & \text{if } T_1 = [19,\ldots,85],\ frac = [-1, 0, 1], \\ ((int)T_1 - 85) + 197, & \text{if } T_1 = [86,\ldots,143],\ frac = 0. \end{cases} \qquad (41) $$

The value of the pitch delay $T_2$ is encoded relative to the value of $T_1$. Using the same
interpretation as before, the fractional delay $T_2$, represented by its integer part (int)$T_2$ and a fractional
part frac/3, frac = -1, 0, 1, is encoded as

$$ P2 = ((int)T_2 - t_{min}) \cdot 3 + frac + 2, \qquad (42) $$

where $t_{min}$ is derived from $T_1$ as before.



To make the coder more robust against random bit errors, a parity bit P0 is computed on the
delay index of the first subframe. The parity bit is generated through an XOR operation on the
6 most significant bits of P1. At the decoder this parity bit is recomputed and if the recomputed
value does not agree with the transmitted value, an error concealment procedure is applied.
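
As an illustration of the index and parity computation described above, the following C sketch encodes P1 from an integer delay and a fraction, and forms a parity bit as the plain XOR of the 6 most significant bits of the 8-bit index. The function names are hypothetical, and whether the transmitted bit is the XOR itself or its complement is not stated in this section, so the plain XOR used here is an assumption.

```c
/* Encode the first-subframe pitch delay into the 8-bit index P1.
 * t_int: integer part of T1 (19..143); frac: -1, 0 or 1 (units of 1/3).
 * Hypothetical helper; frac is expected to be 0 for t_int above 85. */
int encode_p1(int t_int, int frac)
{
    if (t_int <= 85)
        return (t_int - 19) * 3 + frac - 1;   /* 1/3-sample resolution */
    return (t_int - 85) + 197;                /* integer resolution only */
}

/* XOR of the 6 most significant bits of the 8-bit index P1
 * (assumption: no complement is applied to the result). */
int parity_p1(int p1)
{
    int bits = (p1 >> 2) & 0x3F;   /* drop the 2 least significant bits */
    int parity = 0;
    while (bits != 0) {
        parity ^= bits & 1;
        bits >>= 1;
    }
    return parity;
}
```

For example, a delay of 19 1/3 maps to index 0 and the integer delay 143 maps to index 255, so the 8-bit range is used completely.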



3.7.3 Computation of the adaptive-codebook gain

Once the adaptive-codebook delay is determined, the adaptive-codebook gain $g_p$ is computed as

$$g_p = \frac{\sum_{n=0}^{39} x(n)\, y(n)}{\sum_{n=0}^{39} y(n)\, y(n)}, \qquad \text{bounded by } 0 \le g_p \le 1.2,$$  (43)

where $y(n)$ is the filtered adaptive codebook vector (zero-state response of $W(z)/\hat{A}(z)$ to $v(n)$).
This vector is obtained by convolving $v(n)$ with $h(n)$:

$$y(n) = \sum_{i=0}^{n} v(i)\, h(n-i), \qquad n = 0, \ldots, 39.$$  (44)

Note that by maximizing the term in Eq. (37), in most cases $g_p > 0$. In case the signal contains
only negative correlations, the value of $g_p$ is set to 0.
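
The gain computation can be sketched in floating-point C as follows. This is only an illustration of the convolution of Eq. (44) and of the bounded ratio of correlations; the function names, the use of `double` instead of the 16-bit fixed-point arithmetic of the simulation, and the guard for a zero-energy vector are assumptions.

```c
#define L_SUBFR 40   /* subframe length in samples */

/* Eq. (44): y(n) = sum_{i=0..n} v(i) h(n-i), the filtered codebook vector. */
void filter_codevector(const double v[L_SUBFR], const double h[L_SUBFR],
                       double y[L_SUBFR])
{
    for (int n = 0; n < L_SUBFR; n++) {
        double acc = 0.0;
        for (int i = 0; i <= n; i++)
            acc += v[i] * h[n - i];
        y[n] = acc;
    }
}

/* Ratio of <x,y> to <y,y>, bounded to the interval [0, 1.2]. */
double adaptive_gain(const double x[L_SUBFR], const double y[L_SUBFR])
{
    double num = 0.0, den = 0.0;
    for (int n = 0; n < L_SUBFR; n++) {
        num += x[n] * y[n];
        den += y[n] * y[n];
    }
    if (den <= 0.0)          /* assumption: return 0 for an all-zero y(n) */
        return 0.0;
    double g = num / den;
    if (g < 0.0) g = 0.0;    /* negative correlation: gain set to 0 */
    if (g > 1.2) g = 1.2;    /* upper bound */
    return g;
}
```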



3.8 Fixed codebook: structure and search 

The fixed codebook is based on an algebraic codebook structure using an interleaved single-pulse
permutation (ISPP) design. In this codebook, each codebook vector contains 4 non-zero pulses.
Each pulse can have either the amplitude +1 or -1, and can assume the positions given in Table 7.

Table 7: Structure of fixed codebook C.

Pulse   Sign   Positions
i0      s0     0, 5, 10, 15, 20, 25, 30, 35
i1      s1     1, 6, 11, 16, 21, 26, 31, 36
i2      s2     2, 7, 12, 17, 22, 27, 32, 37
i3      s3     3, 8, 13, 18, 23, 28, 33, 38
               4, 9, 14, 19, 24, 29, 34, 39

The codebook vector $c(n)$ is constructed by taking a zero vector, and putting the 4 unit pulses
at the found locations, multiplied with their corresponding sign:

$$c(n) = s0\, \delta(n - i0) + s1\, \delta(n - i1) + s2\, \delta(n - i2) + s3\, \delta(n - i3), \qquad n = 0, \ldots, 39,$$  (45)

where $\delta(0)$ is a unit pulse. A special feature incorporated in the codebook is that the selected
codebook vector is filtered through an adaptive pre-filter $P(z)$ which enhances harmonic components
to improve the synthesized speech quality. Here the filter

$$P(z) = \frac{1}{1 - \beta z^{-T}}$$  (46)

is used, where $T$ is the integer component of the pitch delay of the current subframe, and $\beta$ is a
pitch gain. The value of $\beta$ is made adaptive by using the quantized adaptive codebook gain from
the previous subframe bounded by 0.2 and 0.8:

$$\beta = \hat{g}_p^{(m-1)}, \qquad 0.2 \le \beta \le 0.8.$$  (47)

This filter enhances the harmonic structure for delays less than the subframe size of 40. This
modification is incorporated in the fixed codebook search by modifying the impulse response $h(n)$
according to

$$h(n) = h(n) + \beta\, h(n - T), \qquad n = T, \ldots, 39.$$  (48)
3.8.1 Fixed-codebook search procedure

The fixed codebook is searched by minimizing the mean-squared error between the weighted input
speech $sw(n)$ of Eq. (33) and the weighted reconstructed speech. The target signal used in the
closed-loop pitch search is updated by subtracting the adaptive codebook contribution. That is

$$x_2(n) = x(n) - g_p\, y(n), \qquad n = 0, \ldots, 39,$$  (49)

where $y(n)$ is the filtered adaptive codebook vector of Eq. (44).

The matrix $\mathbf{H}$ is defined as the lower triangular Toeplitz convolution matrix with diagonal $h(0)$
and lower diagonals $h(1), \ldots, h(39)$. If $c_k$ is the algebraic codevector at index $k$, then the codebook
is searched by maximizing the term

$$\frac{C_k^2}{E_k} = \frac{\left(\sum_{n=0}^{39} d(n)\, c_k(n)\right)^2}{c_k^t\, \Phi\, c_k},$$  (50)

where $d(n)$ is the correlation between the target signal $x_2(n)$ and the impulse response $h(n)$, and
$\Phi = \mathbf{H}^t \mathbf{H}$ is the matrix of correlations of $h(n)$. The signal $d(n)$ and the matrix $\Phi$ are computed




before the codebook search. The elements of $d(n)$ are computed from

$$d(n) = \sum_{i=n}^{39} x_2(i)\, h(i - n), \qquad n = 0, \ldots, 39,$$  (51)

and the elements of the symmetric matrix $\Phi$ are computed by

$$\phi(i, j) = \sum_{n=j}^{39} h(n - i)\, h(n - j), \qquad (j \ge i).$$  (52)

Note that only the elements actually needed are computed and an efficient storage procedure
has been designed to speed up the search procedure.

The algebraic structure of the codebook C allows for a fast search procedure since the codebook
vector $c_k$ contains only four nonzero pulses. The correlation in the numerator of Eq. (50) for a
given vector $c_k$ is given by

$$C = \sum_{i=0}^{3} a_i\, d(m_i),$$  (53)

where $m_i$ is the position of the $i$th pulse and $a_i$ is its amplitude. The energy in the denominator
of Eq. (50) is given by

$$E = \sum_{i=0}^{3} \phi(m_i, m_i) + 2 \sum_{i=0}^{2} \sum_{j=i+1}^{3} a_i a_j\, \phi(m_i, m_j).$$  (54)

To simplify the search procedure, the pulse amplitudes are predetermined by quantizing the
signal $d(n)$. This is done by setting the amplitude of a pulse at a certain position equal to the
sign of $d(n)$ at that position. Before the codebook search, the following steps are done. First, the
signal $d(n)$ is decomposed into two signals: the absolute signal $d'(n) = |d(n)|$ and the sign signal
sign[$d(n)$]. Second, the matrix $\Phi$ is modified by including the sign information; that is,

$$\phi'(i, j) = \mathrm{sign}[d(i)]\, \mathrm{sign}[d(j)]\, \phi(i, j), \qquad i = 0, \ldots, 39, \quad j = i, \ldots, 39.$$  (55)

To remove the factor 2 in Eq. (54),

$$\phi'(i, i) = 0.5\, \phi(i, i), \qquad i = 0, \ldots, 39.$$  (56)

The correlation in Eq. (53) is now given by

$$C = d'(m_0) + d'(m_1) + d'(m_2) + d'(m_3),$$  (57)

and the energy in Eq. (54) is given by

$$E/2 = \phi'(m_0, m_0)$$
$$\quad + \phi'(m_1, m_1) + \phi'(m_0, m_1)$$
$$\quad + \phi'(m_2, m_2) + \phi'(m_0, m_2) + \phi'(m_1, m_2)$$
$$\quad + \phi'(m_3, m_3) + \phi'(m_0, m_3) + \phi'(m_1, m_3) + \phi'(m_2, m_3).$$  (58)

A focused search approach is used to further simplify the search procedure. In this approach a
precomputed threshold is tested before entering the last loop, and the loop is entered only if this
threshold is exceeded. The maximum number of times the loop can be entered is fixed so that a
low percentage of the codebook is searched. The threshold is computed based on the correlation
C. The maximum absolute correlation and the average correlation due to the contribution of the
first three pulses, $max_3$ and $av_3$, are found before the codebook search. The threshold is given by

$$thr_3 = av_3 + K_3\, (max_3 - av_3).$$  (59)

The fourth loop is entered only if the absolute correlation (due to three pulses) exceeds $thr_3$, where
$0 \le K_3 < 1$. The value of $K_3$ controls the percentage of the codebook searched and it is set here to 0.4.
Note that this results in a variable search time, and to further control the search the number of
times the last loop is entered (for the 2 subframes) cannot exceed a certain maximum, which is set
here to 180 (the average worst case per subframe is 90 times).



3.8.2 Codeword computation of the fixed codebook

The pulse positions of the pulses i0, i1, and i2 are encoded with 3 bits each, while the position of
i3 is encoded with 4 bits. Each pulse amplitude is encoded with 1 bit. This gives a total of 17 bits
for the 4 pulses. By defining s = 1 if the sign is positive and s = 0 if the sign is negative, the sign
codeword is obtained from

$$S = s0 + 2 \cdot s1 + 4 \cdot s2 + 8 \cdot s3$$  (60)

and the fixed codebook codeword is obtained from

$$C = (i0/5) + 8 \cdot (i1/5) + 64 \cdot (i2/5) + 512 \cdot (2 \cdot (i3/5) + jx)$$  (61)

where jx = 0 if i3 = 3, 8, ... and jx = 1 if i3 = 4, 9, ...
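
The packing of the four pulses into the sign and position codewords can be sketched in C as below. The function names are hypothetical, the divisions are integer divisions, and the layout (one sign bit per pulse with the pulse i0 in the least significant bit, and a 13-bit position word) is an assumption based on the bit counts given above.

```c
/* Sign codeword: bit k is 1 when pulse k is positive (sign[k] = +1 or -1). */
int sign_codeword(const int sign[4])
{
    int s = 0;
    for (int k = 0; k < 4; k++)
        if (sign[k] > 0)
            s |= 1 << k;
    return s;
}

/* Position codeword from the pulse positions i0..i3 (tracks of Table 7).
 * Integer division by 5 gives the 3-bit index of a position in its track;
 * jx selects which of the two tracks of i3 is used. */
int position_codeword(const int pos[4])
{
    int jx = (pos[3] % 5 == 4) ? 1 : 0;   /* i3 = 4, 9, ... uses the second track */
    return (pos[0] / 5)
         + 8 * (pos[1] / 5)
         + 64 * (pos[2] / 5)
         + 512 * (2 * (pos[3] / 5) + jx);
}
```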

3.9 Quantization of the gains 

The adaptive-codebook gain (pitch gain) and the fixed (algebraic) codebook gain are vector quantized
using 7 bits. The gain codebook search is done by minimizing the mean-squared weighted
error between original and reconstructed speech, which is given by

$$E = x^t x + g_p^2\, y^t y + g_c^2\, z^t z - 2 g_p\, x^t y - 2 g_c\, x^t z + 2 g_p g_c\, y^t z,$$  (62)

where $x$ is the target vector (see Section 3.6), $y$ is the filtered adaptive codebook vector of Eq. (44),
and $z$ is the fixed codebook vector convolved with $h(n)$:

$$z(n) = \sum_{i=0}^{n} c(i)\, h(n - i), \qquad n = 0, \ldots, 39.$$  (63)

3.9.1 Gain prediction 

The fixed codebook gain $g_c$ can be expressed as

$$g_c = \gamma\, g'_c,$$  (64)

where $g'_c$ is a predicted gain based on previous fixed codebook energies, and $\gamma$ is a correction factor.
The mean energy of the fixed codebook contribution is given by

$$E = 10 \log \left( \frac{1}{40} \sum_{i=0}^{39} c_i^2 \right).$$  (65)

After scaling the vector $c_i$ with the fixed codebook gain $g_c$, the energy of the scaled fixed codebook
is given by $20 \log g_c + E$. Let $E^{(m)}$ be the mean-removed energy (in dB) of the (scaled) fixed
codebook contribution at subframe $m$, given by

$$E^{(m)} = 20 \log g_c + E - \bar{E},$$  (66)

where $\bar{E}$ = 30 dB is the mean energy of the fixed codebook excitation. The gain $g_c$ can be expressed
as a function of $E^{(m)}$, $E$, and $\bar{E}$ by

$$g_c = 10^{(E^{(m)} + \bar{E} - E)/20}.$$  (67)

The predicted gain $g'_c$ is found by predicting the log-energy of the current fixed codebook
contribution from the log-energy of previous fixed codebook contributions. The 4th order MA
prediction is done as follows. The predicted energy is given by

$$\tilde{E}^{(m)} = \sum_{i=1}^{4} b_i\, \hat{R}^{(m-i)},$$  (68)

where $[b_1\ b_2\ b_3\ b_4] = [0.68\ 0.58\ 0.34\ 0.19]$ are the MA prediction coefficients, and $\hat{R}^{(m)}$ is the
quantized version of the prediction error $R^{(m)}$ at subframe $m$, defined by

$$R^{(m)} = E^{(m)} - \tilde{E}^{(m)}.$$  (69)

The predicted gain $g'_c$ is found by replacing $E^{(m)}$ by its predicted value in Eq. (67):

$$g'_c = 10^{(\tilde{E}^{(m)} + \bar{E} - E)/20}.$$  (70)

The correction factor $\gamma$ is related to the gain-prediction error by

$$R^{(m)} = E^{(m)} - \tilde{E}^{(m)} = 20 \log(\gamma).$$  (71)

3.9.2 Codebook search for gain quantization

The adaptive-codebook gain $g_p$ and the factor $\gamma$ are vector quantized using a 2-stage conjugate
structured codebook. The first stage consists of a 3 bit two-dimensional codebook GA, and the
second stage consists of a 4 bit two-dimensional codebook GB. The first element in each codebook
represents the quantized adaptive codebook gain $\hat{g}_p$, and the second element represents the
quantized fixed codebook gain correction factor $\hat{\gamma}$. Given codebook indices $m$ and $n$ for GA and GB,
respectively, the quantized adaptive-codebook gain is given by

$$\hat{g}_p = GA_1(m) + GB_1(n),$$  (72)

and the quantized fixed-codebook gain by

$$\hat{g}_c = g'_c\, \hat{\gamma} = g'_c\, (GA_2(m) + GB_2(n)).$$  (73)

This conjugate structure simplifies the codebook search by applying a pre-selection process.
The optimum pitch gain $g_p$ and fixed-codebook gain $g_c$ are derived from Eq. (62), and are used for
the pre-selection. The codebook GA contains 8 entries in which the second element (corresponding
to $g_c$) has in general larger values than the first element (corresponding to $g_p$). This bias allows
a pre-selection using the value of $g_c$. In this pre-selection process, a cluster of 4 vectors whose
second elements are close to $gx_c$ is selected, where $gx_c$ is derived from $g_c$ and $g_p$. Similarly, the
codebook GB contains 16 entries which have a bias towards the first element (corresponding to $g_p$). A
cluster of 8 vectors whose first elements are close to $g_p$ is selected. Hence for each codebook
the best 50% candidate vectors are selected. This is followed by an exhaustive search over the
remaining 4 · 8 = 32 possibilities, such that the combination of the two indices minimizes the
weighted mean-squared error of Eq. (62).
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3.9.3 Codeword computation for gain quantizer 

The codewords GA and GB for the gain quantizer are obtained from the indices corresponding to
the best choice. To reduce the impact of single bit errors the codebook indices are mapped.

3.10 Memory update

An update of the states of the synthesis and weighting filters is needed to compute the target signal
in the next subframe. After the two gains are quantized, the excitation signal, $u(n)$, in the present
subframe is found by

$$u(n) = \hat{g}_p\, v(n) + \hat{g}_c\, c(n), \qquad n = 0, \ldots, 39,$$  (74)

where $\hat{g}_p$ and $\hat{g}_c$ are the quantized adaptive and fixed codebook gains, respectively, $v(n)$ is the adaptive
codebook vector (interpolated past excitation), and $c(n)$ is the fixed codebook vector (algebraic
codevector including pitch sharpening). The states of the filters can be updated by filtering the
signal $r(n) - u(n)$ (difference between residual and excitation) through the filters $1/\hat{A}(z)$ and
$A(z/\gamma_1)/A(z/\gamma_2)$ for the 40 sample subframe and saving the states of the filters. This would
require 3 filter operations. A simpler approach, which requires only one filtering, is as follows.
The local synthesis speech, $\hat{s}(n)$, is computed by filtering the excitation signal through $1/\hat{A}(z)$.
The output of the filter due to the input $r(n) - u(n)$ is equivalent to $e(n) = s(n) - \hat{s}(n)$. So the
states of the synthesis filter $1/\hat{A}(z)$ are given by $e(n)$, n = 30, ..., 39. Updating the states of the
filter $A(z/\gamma_1)/A(z/\gamma_2)$ can be done by filtering the error signal $e(n)$ through this filter to find the
perceptually weighted error $ew(n)$. However, the signal $ew(n)$ can be equivalently found by

$$ew(n) = x(n) - \hat{g}_p\, y(n) - \hat{g}_c\, z(n).$$  (75)

Since the signals $x(n)$, $y(n)$, and $z(n)$ are available, the states of the weighting filter are updated
by computing $ew(n)$ as in Eq. (75) for n = 30, ..., 39. This saves two filter operations.

3.11 Encoder and Decoder initialization 

All static encoder variables should be initialized to 0, except the variables listed in Table 8. These
variables need to be initialized for the decoder as well.
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4 Functional description of the decoder 

The signal flow at the decoder was shown in Section 2 (Figure 3). First the parameters are decoded
(LP coefficients, adaptive codebook vector, fixed codebook vector, and gains). These decoded
parameters are used to compute the reconstructed speech signal. This process is described in
Section 4.1. This reconstructed signal is enhanced by a post-processing operation consisting of a
postfilter and a high-pass filter (Section 4.2). Section 4.3 describes the error concealment procedure
used when either a parity error has occurred, or when the frame erasure flag has been set.

4.1 Parameter decoding procedure 

The transmitted parameters are listed in Table 9.

Table 9: Description of transmitted parameter indices. The bitstream ordering is reflected by the
order in the table. For each parameter the most significant bit (MSB) is transmitted first.

Symbol   Description
L0       Switched predictor index of LSP quantizer
L1       First stage vector of LSP quantizer
L2       Second stage lower vector of LSP quantizer
L3       Second stage higher vector of LSP quantizer
P1       Pitch delay 1st subframe
P0       Parity bit for pitch
S1       Signs of pulses 1st subframe
C1       Fixed codebook 1st subframe
GA1      Gain codebook (stage 1) 1st subframe
GB1      Gain codebook (stage 2) 1st subframe
P2       Pitch delay 2nd subframe
S2       Signs of pulses 2nd subframe
C2       Fixed codebook 2nd subframe
GA2      Gain codebook (stage 1) 2nd subframe
GB2      Gain codebook (stage 2) 2nd subframe

At startup all static encoder variables should be initialized to 0, except the variables listed in
Table 8. The decoding process is done in the following order:
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4.1.1 Decoding of LP filter parameters 

The received indices L0, L1, L2, and L3 of the LSP quantizer are used to reconstruct the quantized
LSP coefficients using the procedure described in Section 3.2.4. The interpolation procedure
described in Section 3.2.5 is used to obtain 2 interpolated LSP vectors (corresponding to 2 subframes).
For each subframe, the interpolated LSP vector is converted to LP filter coefficients $a_i$,
which are used for synthesizing the reconstructed speech in the subframe.

The following steps are repeated for each subframe: 

1. decoding of the adaptive codebook vector. 

2. decoding of the fixed codebook vector, 

3. decoding of the adaptive and fixed codebook gains, 

4. computation of the reconstructed speech. 

4.1.2 Decoding of the adaptive codebook vector 

The received adaptive codebook index is used to find the integer and fractional parts of the pitch
delay. The integer part (int)T1 and fractional part frac of T1 are obtained from P1 as follows:

    if P1 < 197
        (int)T1 = (P1 + 2)/3 + 19
        frac = P1 - (int)T1 * 3 + 58
    else
        (int)T1 = P1 - 112
        frac = 0
    end

The integer and fractional parts of T2 are obtained from P2 and t_min, where t_min is derived
from P1 as follows:

    t_min = (int)T1 - 5
    if t_min < 20 then t_min = 20
    t_max = t_min + 9
    if t_max > 143 then
        t_max = 143
        t_min = t_max - 9
    end

Now T2 is obtained from

    (int)T2 = (P2 + 2)/3 - 1 + t_min
    frac = P2 - 2 - ((P2 + 2)/3 - 1) * 3

The adaptive codebook vector v(n) is found by interpolating the past excitation u(n) (at the
pitch delay) using Eq. (40).
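
The decoding steps for the two pitch indices translate almost directly into C integer arithmetic. The sketch below assumes the lower bound on T2 is (int)T1 - 5 clamped to the range 20..134, as in the steps above; the type and function names are hypothetical and not taken from the simulation code.

```c
/* A pitch delay as an integer part plus a fraction in units of 1/3. */
typedef struct {
    int t_int;   /* integer part of the delay */
    int frac;    /* -1, 0 or 1 */
} PitchDelay;

/* Decode the 8-bit first-subframe index P1. */
PitchDelay decode_p1(int p1)
{
    PitchDelay t;
    if (p1 < 197) {
        t.t_int = (p1 + 2) / 3 + 19;
        t.frac = p1 - t.t_int * 3 + 58;
    } else {
        t.t_int = p1 - 112;
        t.frac = 0;
    }
    return t;
}

/* Lower bound of the second-subframe search range, derived from (int)T1. */
int pitch_tmin(int t1_int)
{
    int tmin = t1_int - 5;
    if (tmin < 20)
        tmin = 20;
    if (tmin + 9 > 143)      /* keep t_max = t_min + 9 within 143 */
        tmin = 143 - 9;
    return tmin;
}

/* Decode the 5-bit second-subframe index P2 relative to t_min. */
PitchDelay decode_p2(int p2, int tmin)
{
    PitchDelay t;
    int k = (p2 + 2) / 3 - 1;
    t.t_int = k + tmin;
    t.frac = p2 - 2 - k * 3;
    return t;
}
```

With these definitions P1 = 0 decodes to the delay 19 1/3 and P1 = 255 to the integer delay 143, and the 32 values of P2 cover the delays from t_min - 2/3 to t_min + 9 2/3.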

4.1.3 Decoding of the fixed codebook vector 

The received fixed codebook index C is used to extract the positions of the excitation pulses. The
pulse signs are obtained from S. Once the pulse positions and signs are decoded, the fixed codebook
vector c(n) can be constructed. If the integer part of the pitch delay, T, is less than the subframe
size 40, the pitch enhancement procedure is applied which modifies c(n) according to Eq. (48).
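
A minimal C sketch of this decoding step is given below. It assumes a 13-bit position index laid out as three 3-bit track indices, one track-select bit and one further 3-bit index, and a 4-bit sign index with one bit per pulse (1 meaning positive). The function names are hypothetical, and the in-place update used for the pitch enhancement is one possible reading of Eq. (48).

```c
#define L_SUBFR 40   /* subframe length in samples */

/* Rebuild the algebraic codevector c(n) from the position and sign indices. */
void decode_fixed_codevector(int c_index, int s_index, double c[L_SUBFR])
{
    int pos[4];
    int jx = (c_index >> 9) & 1;                   /* track select for i3 */
    pos[0] = (c_index & 7) * 5;                    /* i0: 0, 5, ..., 35 */
    pos[1] = ((c_index >> 3) & 7) * 5 + 1;         /* i1: 1, 6, ..., 36 */
    pos[2] = ((c_index >> 6) & 7) * 5 + 2;         /* i2: 2, 7, ..., 37 */
    pos[3] = ((c_index >> 10) & 7) * 5 + 3 + jx;   /* i3: 3, 8, ... or 4, 9, ... */

    for (int n = 0; n < L_SUBFR; n++)
        c[n] = 0.0;
    for (int k = 0; k < 4; k++)
        c[pos[k]] = ((s_index >> k) & 1) ? 1.0 : -1.0;
}

/* Pitch enhancement: c(n) = c(n) + beta * c(n - T) for n = T..39.
 * Updating in place in ascending order makes this the recursive filter
 * 1 / (1 - beta z^-T); beta is clamped to [0.2, 0.8]. */
void pitch_enhance(double c[L_SUBFR], int t, double beta)
{
    if (beta < 0.2) beta = 0.2;
    if (beta > 0.8) beta = 0.8;
    for (int n = t; n < L_SUBFR; n++)
        c[n] += beta * c[n - t];
}
```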

4.1.4 Decoding of the adaptive and fixed codebook gains

The received gain codebook index gives the adaptive codebook gain $\hat{g}_p$ and the fixed codebook
gain correction factor $\hat{\gamma}$. This procedure is described in detail in Section 3.9. The estimated fixed
codebook gain $g'_c$ is found using Eq. (70). The fixed codebook gain is obtained from the product
of the quantized gain correction factor with this predicted gain (Eq. (64)). The adaptive codebook
gain is reconstructed using Eq. (72).

4.1.5 Computation of the parity bit

Before the speech is reconstructed, the parity bit is recomputed from the adaptive codebook delay
(Section 3.7.2). If this bit is not identical to the transmitted parity bit P0, it is likely that bit
errors occurred during transmission and the error concealment procedure of Section 4.3 is used.

4.1.6 Computing the reconstructed speech 

The excitation $u(n)$ at the input of the synthesis filter (see Eq. (74)) is input to the LP synthesis
filter. The reconstructed speech for the subframe is given by

$$\hat{s}(n) = u(n) - \sum_{i=1}^{10} \hat{a}_i\, \hat{s}(n - i), \qquad n = 0, \ldots, 39,$$  (76)

where $\hat{a}_i$ are the interpolated LP filter coefficients.

The reconstructed speech $\hat{s}(n)$ is then processed by a post processor which is described in the
next section.
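
The synthesis recursion of Eq. (76) can be sketched as follows in floating-point C. The function name, the memory layout (the 10 most recent output samples, oldest first) and the use of `double` rather than fixed-point arithmetic are assumptions made for the illustration.

```c
#define M_ORDER 10   /* LP filter order */
#define L_SUBFR 40   /* subframe length in samples */

/* All-pole synthesis: s(n) = u(n) - sum_{i=1..10} a[i] * s(n - i).
 * a[0] is unused (implicitly 1); mem holds s(-10)..s(-1) on entry and
 * the last 10 output samples on exit, ready for the next subframe. */
void lp_synthesis(const double a[M_ORDER + 1], const double u[L_SUBFR],
                  double s[L_SUBFR], double mem[M_ORDER])
{
    double buf[M_ORDER + L_SUBFR];

    for (int i = 0; i < M_ORDER; i++)
        buf[i] = mem[i];

    for (int n = 0; n < L_SUBFR; n++) {
        double acc = u[n];
        for (int i = 1; i <= M_ORDER; i++)
            acc -= a[i] * buf[M_ORDER + n - i];
        buf[M_ORDER + n] = acc;
        s[n] = acc;
    }

    for (int i = 0; i < M_ORDER; i++)
        mem[i] = buf[L_SUBFR + i];
}
```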



4.2 Post-processing

Post-processing consists of three functions: adaptive postfiltering, high-pass filtering, and signal
up-scaling. The adaptive postfilter is the cascade of three filters: a pitch postfilter $H_p(z)$, a
short-term postfilter $H_f(z)$, and a tilt compensation filter $H_t(z)$, followed by an adaptive gain
control procedure. The postfilter is updated every subframe of 5 ms. The postfiltering process
is organized as follows. First, the synthesis speech $\hat{s}(n)$ is inverse filtered through $\hat{A}(z/\gamma_n)$ to
produce the residual signal $\hat{r}(n)$. The signal $\hat{r}(n)$ is used to compute the pitch delay $T$ and gain
$g_{pit}$. The signal $\hat{r}(n)$ is filtered through the pitch postfilter $H_p(z)$ to produce the signal $r'(n)$ which,
in its turn, is filtered by the synthesis filter $1/[g_f \hat{A}(z/\gamma_d)]$. Finally, the signal at the output of
the synthesis filter $1/[g_f \hat{A}(z/\gamma_d)]$ is passed to the tilt compensation filter $H_t(z)$, resulting in the
postfiltered synthesis speech signal $sf(n)$. Adaptive gain control is then applied between $sf(n)$
and $\hat{s}(n)$, resulting in the signal $sf'(n)$. The high-pass filtering and scaling operation operate on
the postfiltered signal $sf'(n)$.



4.2.1 Pitch postfilter 

The pitch, or harmonic, postfilter is given by

$$H_p(z) = \frac{1}{1 + g_0}\, (1 + g_0 z^{-T}),$$  (77)

where $T$ is the pitch delay and $g_0$ is a gain factor given by

$$g_0 = \gamma_p\, g_{pit},$$  (78)

where $g_{pit}$ is the pitch gain. Both the pitch delay and gain are determined from the decoder output
signal. Note that $g_{pit}$ is bounded by 1, and it is set to zero if the pitch prediction gain is less than
3 dB. The factor $\gamma_p$ controls the amount of harmonic postfiltering and has the value $\gamma_p$ = 0.5. The
pitch delay and gain are computed from the residual signal $\hat{r}(n)$ obtained by filtering the speech
$\hat{s}(n)$ through $\hat{A}(z/\gamma_n)$, which is the numerator of the short-term postfilter (see Section 4.2.2):

$$\hat{r}(n) = \hat{s}(n) + \sum_{i=1}^{10} \gamma_n^i\, \hat{a}_i\, \hat{s}(n - i).$$  (79)

The pitch delay is computed using a two pass procedure. The first pass selects the best integer $T_0$
in the range $[T_1 - 1, T_1 + 1]$, where $T_1$ is the integer part of the (transmitted) pitch delay in the
first subframe. The best integer delay is the one that maximizes the correlation

$$R(k) = \sum_{n=0}^{39} \hat{r}(n)\, \hat{r}_k(n).$$  (80)

The second pass chooses the best fractional delay $T$ with resolution 1/8 around $T_0$. This is done
by finding the delay with the highest normalized correlation

$$R'(k) = \frac{\sum_{n=0}^{39} \hat{r}(n)\, \hat{r}_k(n)}{\sqrt{\sum_{n=0}^{39} \hat{r}_k(n)\, \hat{r}_k(n)}},$$  (81)

where $\hat{r}_k(n)$ is the residual signal at delay $k$. Once the optimal delay $T$ is found, the corresponding
correlation value is compared against a threshold. If $R'(T) < 0.5$ then the harmonic postfilter is
disabled by setting $g_{pit}$ = 0. Otherwise the value of $g_{pit}$ is computed from:

$$g_{pit} = \frac{\sum_{n=0}^{39} \hat{r}(n)\, \hat{r}_k(n)}{\sum_{n=0}^{39} \hat{r}_k(n)\, \hat{r}_k(n)}, \qquad \text{bounded by } 0 \le g_{pit} \le 1.0.$$  (82)

The noninteger delayed signal $\hat{r}_k(n)$ is first computed using an interpolation filter of length 33.
After the selection of $T$, $\hat{r}_k(n)$ is recomputed with a longer interpolation filter of length 129. The
new signal replaces the previous one only if the longer filter increases the value of $R'(T)$.



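4.2.2 Short-term postfilter

The short-term postfilter is given by

$$H_f(z) = \frac{1}{g_f}\, \frac{\hat{A}(z/\gamma_n)}{\hat{A}(z/\gamma_d)},$$  (83)

where $\hat{A}(z)$ is the received quantized LP inverse filter (LP analysis is not done at the decoder),
and the factors $\gamma_n$ and $\gamma_d$ control the amount of short-term postfiltering, and are set to $\gamma_n$ = 0.55
and $\gamma_d$ = 0.7. The gain term $g_f$ is calculated on the truncated impulse response, $h_f(n)$, of the
filter $\hat{A}(z/\gamma_n)/\hat{A}(z/\gamma_d)$ and given by

$$g_f = \sum_{n=0}^{19} |h_f(n)|.$$  (84)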



4.2.3 Tilt compensation

Finally, the filter $H_t(z)$ compensates for the tilt in the short-term postfilter $H_f(z)$ and is given by

$$H_t(z) = \frac{1}{g_t}\, (1 + \gamma_t k_1 z^{-1}),$$  (85)

where $\gamma_t k_1$ is a tilt factor, $k_1$ being the first reflection coefficient calculated on $h_f(n)$ with

$$k_1 = -\frac{r_h(1)}{r_h(0)}, \qquad r_h(i) = \sum_{j=0}^{19-i} h_f(j)\, h_f(j + i).$$  (86)

The gain term $g_t = 1 - |\gamma_t k_1|$ compensates for the decreasing effect of $g_f$ in $H_f(z)$. Furthermore,
it has been shown that the product filter $H_f(z) H_t(z)$ has generally no gain.

Two values for $\gamma_t$ are used depending on the sign of $k_1$. If $k_1$ is negative, $\gamma_t$ = 0.9, and if $k_1$ is
positive, $\gamma_t$ = 0.2.



4.2.4 Adaptive gain control 

Adaptive gain control is used to compensate for gain differences between the reconstructed speech
signal $\hat{s}(n)$ and the postfiltered signal $sf(n)$. The gain scaling factor $G$ for the present subframe
is computed by

$$G = \frac{\sum_{n=0}^{39} |\hat{s}(n)|}{\sum_{n=0}^{39} |sf(n)|}.$$  (87)

The gain-scaled postfiltered signal $sf'(n)$ is given by

$$sf'(n) = g(n)\, sf(n), \qquad n = 0, \ldots, 39,$$  (88)

where $g(n)$ is updated on a sample-by-sample basis and given by

$$g(n) = 0.85\, g(n - 1) + 0.15\, G, \qquad n = 0, \ldots, 39.$$  (89)

The initial value of $g(-1)$ = 1.0.
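
The gain control loop can be sketched in C as follows. The gain factor is taken here as the ratio of the summed magnitudes of the two signals over the subframe, which is an assumption about the exact form of the scaling factor; the function name and the handling of an all-zero postfiltered signal are likewise assumptions.

```c
#define L_SUBFR 40   /* subframe length in samples */

/* Scale the postfiltered signal sf(n) towards the level of s_hat(n).
 * g_state carries g(n-1) across subframes and should start at 1.0. */
void gain_control(const double s_hat[L_SUBFR], const double sf[L_SUBFR],
                  double sf_scaled[L_SUBFR], double *g_state)
{
    double num = 0.0, den = 0.0;
    for (int n = 0; n < L_SUBFR; n++) {
        num += (s_hat[n] < 0.0) ? -s_hat[n] : s_hat[n];
        den += (sf[n] < 0.0) ? -sf[n] : sf[n];
    }
    /* assumption: leave the level unchanged when sf(n) is all zero */
    double big_g = (den > 0.0) ? num / den : 1.0;

    double g = *g_state;
    for (int n = 0; n < L_SUBFR; n++) {
        g = 0.85 * g + 0.15 * big_g;   /* sample-by-sample smoothing */
        sf_scaled[n] = g * sf[n];
    }
    *g_state = g;
}
```

Because of the smoothing, a level difference between the two signals is corrected gradually over the subframe rather than in one step.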

4.2.5 High-pass filtering and up-scaling

A high-pass filter at a cutoff frequency of 100 Hz is applied to the reconstructed and postfiltered
speech $sf'(n)$. The filter is given by

$$H_{h2}(z) = \frac{0.93980581 - 1.8795834\, z^{-1} + 0.93980581\, z^{-2}}{1 - 1.9330735\, z^{-1} + 0.93589199\, z^{-2}}.$$  (90)

Up-scaling consists of multiplying the high-pass filtered output by a factor 2 to retrieve the
input signal level.
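
A direct-form sketch of this second-order filter, followed by the factor-2 up-scaling, is shown below in floating-point C. The coefficient values are those of the transfer function above, read as 0.93980581 and -1.8795834 in the numerator and -1.9330735 and 0.93589199 in the denominator; the state structure and function name are hypothetical.

```c
/* Filter memory: the two previous inputs and the two previous outputs. */
typedef struct {
    double x1, x2;
    double y1, y2;
} Hp2State;

/* Second-order high-pass filter followed by up-scaling by 2. */
void highpass_upscale(Hp2State *st, const double *in, double *out, int len)
{
    const double b0 = 0.93980581, b1 = -1.8795834, b2 = 0.93980581;
    const double a1 = -1.9330735, a2 = 0.93589199;

    for (int n = 0; n < len; n++) {
        double x0 = in[n];
        double y0 = b0 * x0 + b1 * st->x1 + b2 * st->x2
                  - a1 * st->y1 - a2 * st->y2;
        st->x2 = st->x1;
        st->x1 = x0;
        st->y2 = st->y1;
        st->y1 = y0;
        out[n] = 2.0 * y0;   /* up-scaling */
    }
}
```

Feeding a constant input shows the intended behaviour: after the initial transient the output settles close to zero, since the filter removes the DC component.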
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4.3 Concealment of frame erasures and parity errors 

An error concealment procedure has been incorporated in the decoder to reduce the degradations
in the reconstructed speech because of frame erasures or random errors in the bitstream. This error
concealment process is functional when either i) the frame of coder parameters (corresponding to
a 10 ms frame) has been identified as being erased, or ii) a checksum error occurs on the parity
bit for the pitch delay index P1. The latter could occur when the bitstream has been corrupted
by random bit errors.

If a parity error occurs on P1, the delay value T1 is set to the value of the delay of the previous
frame. The value of T2 is derived with the procedure outlined in Section 4.1.2, using this new value
of T1. If consecutive parity errors occur, the previous value of T1, incremented by 1, is used.

The mechanism for detecting frame erasures is not defined in the Recommendation, and will
depend on the application. The concealment strategy has to reconstruct the current frame, based
on previously received information. The method used replaces the missing excitation signal with
one of similar characteristics, while gradually decaying its energy. This is done by using a voicing
classifier based on the long-term prediction gain, which is computed as part of the long-term
postfilter analysis. The pitch postfilter (see Section 4.2.1) finds the long-term predictor for which
the prediction gain is more than 3 dB. This is done by setting a threshold of 0.5 on the normalized
correlation $R'(k)$ (Eq. (81)). For the error concealment process, these frames will be classified as
periodic. Otherwise the frame is declared nonperiodic. An erased frame inherits its class from
the preceding (reconstructed) speech frame. Note that the voicing classification is continuously
updated based on this reconstructed speech signal. Hence, for many consecutive erased frames the
classification might change. Typically, this only happens if the original classification was periodic.

The specific steps taken for an erased frame are:

1. repetition of the LP filter parameters,

2. attenuation of adaptive and fixed codebook gains,

3. attenuation of the memory of the gain predictor,

4. generation of the replacement excitation.

4.3.1 Repetition of LP filter parameters

The LP parameters of the last good frame are used. The states of the LSF predictor contain the
values of the received codewords $l_i$. Since the current codeword is not available, it is computed
from the repeated LSF parameters $\hat{\omega}_i$ and the predictor memory from



(91) 



4.3.2 Attenuation of adaptive and fixed codebook gains

An attenuated version of the previous fixed codebook gain is used.

$$g_c^{(m)} = 0.98\, g_c^{(m-1)}.$$  (92)

The same is done for the adaptive codebook gain. In addition a clipping operation is used to keep
its value below 0.9.

$$g_p^{(m)} = 0.9\, g_p^{(m-1)} \quad \text{and} \quad g_p^{(m)} < 0.9.$$  (93)

4.3.3 Attenuation of the memory of the gain predictor

The gain predictor uses the energy of previously selected codebooks. To allow for a smooth
continuation of the coder once good frames are received, the memory of the gain predictor is
updated with an attenuated version of the codebook energy. The value of $\hat{R}^{(m)}$ for the current
subframe $m$ is set to the averaged quantized gain prediction error, attenuated by 4 dB.

$$\hat{R}^{(m)} = \left( 0.25 \sum_{i=1}^{4} \hat{R}^{(m-i)} \right) - 4.0 \quad \text{and} \quad \hat{R}^{(m)} \ge -14.$$  (94)
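
The gain handling for one erased subframe can be sketched in C as below. The factor 0.9 and the clipping of the adaptive codebook gain, the 4 dB attenuation and the floor of -14 follow the text; the factor 0.98 for the fixed codebook gain is an assumption where that equation is not legible, and the structure and function names are hypothetical.

```c
/* Decoder gain state carried from subframe to subframe. */
typedef struct {
    double g_p;        /* previous adaptive codebook gain */
    double g_c;        /* previous fixed codebook gain */
    double r_hat[4];   /* gain predictor memory, r_hat[0] is the most recent */
} GainState;

/* Update the gains and the predictor memory for one erased subframe. */
void conceal_gains(GainState *st)
{
    st->g_c *= 0.98;             /* assumed attenuation of the fixed gain */

    st->g_p *= 0.9;              /* attenuate the adaptive gain ... */
    if (st->g_p > 0.9)
        st->g_p = 0.9;           /* ... and clip it */

    /* average of the last four prediction errors, attenuated by 4 dB */
    double r = 0.25 * (st->r_hat[0] + st->r_hat[1]
                     + st->r_hat[2] + st->r_hat[3]) - 4.0;
    if (r < -14.0)
        r = -14.0;               /* lower bound */

    st->r_hat[3] = st->r_hat[2];
    st->r_hat[2] = st->r_hat[1];
    st->r_hat[1] = st->r_hat[0];
    st->r_hat[0] = r;
}
```

Calling the function once per erased subframe makes both gains and the predicted energy decay progressively, which is the gradual energy decay the concealment strategy aims for.
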
4.3.4 Generation of the replacement excitation

The excitation used depends on the periodicity classification. If the last correctly received frame
was classified as periodic, the current frame is considered to be periodic as well. In that case only
the adaptive codebook is used, and the fixed codebook contribution is set to zero. The pitch delay
is based on the last correctly received pitch delay and is repeated for each successive frame. To
avoid excessive periodicity the delay is increased by one for each next subframe but bounded by
143. The adaptive codebook gain is based on an attenuated value according to Eq. (93).

If the last correctly received frame was classified as nonperiodic, the current frame is considered
to be nonperiodic as well, and the adaptive codebook contribution is set to zero. The fixed codebook
contribution is generated by randomly selecting a codebook index and sign index. The random
generator is based on the function

$$seed = seed \cdot 31821 + 13849,$$  (95)

with the initial seed value of 21845. The random codebook index is derived from the 13 least
significant bits of the next random number. The random sign is derived from the 4 least significant
bits of the next random number. The fixed codebook gain is attenuated according to Eq. (92).
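
The random index generation can be illustrated with the following C sketch. It assumes that the recursion is evaluated modulo 2^16, in line with the 16-bit arithmetic of the simulation, and keeps the state in an unsigned 16-bit variable so that the wrap-around is well defined; the function names are hypothetical.

```c
#include <stdint.h>

/* One step of seed = seed * 31821 + 13849, kept to 16 bits. */
uint16_t next_random(uint16_t *seed)
{
    *seed = (uint16_t)((uint32_t)(*seed) * 31821u + 13849u);
    return *seed;
}

/* Draw a random fixed codebook index (13 bits) and sign index (4 bits)
 * from two successive random numbers. */
void random_fixed_indices(uint16_t *seed, int *c_index, int *s_index)
{
    *c_index = next_random(seed) & 0x1FFF;   /* 13 least significant bits */
    *s_index = next_random(seed) & 0x000F;   /* 4 least significant bits */
}
```

Starting from the seed 21845, the first two random numbers under this assumption are 3242 and 23867, giving the codebook index 3242 and the sign index 11.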
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5 Bit-exact description of the CS-ACELP coder 

ANSI C code simulating the CS-ACELP coder in 16 bit fixed-point is available from ITU-T. The
following sections summarize the use of this simulation code, and how the software is organized.

5.1 Use of the simulation software

The C code consists of two main programs: coder.c, which simulates the encoder, and decoder.c,
which simulates the decoder. The encoder is run as follows:

    coder inputfile bstreamfile

The inputfile and outputfile are sampled data files containing 16-bit PCM signals. The bitstream
file contains 81 16-bit words, where the first word can be used to indicate frame erasure, and the
remaining 80 words contain one bit each. The decoder takes this bitstream file and produces a
postfiltered output file containing a 16-bit PCM signal.

    decoder bstreamfile outputfile



5.2 Organization of the simulation software 



In the fixed-point ANSI C simulation, only two types of fixed-point data are used, as is shown in
Table 10. To facilitate the implementation of the simulation code, loop indices, Boolean values and
flags use the type Flag, which would be either 16 bit or 32 bits depending on the target platform.

Table 10: Data types used in ANSI C simulation.

Type     Max. value     Min. value     Description
Word16   0x7fff         0x8000         signed 2's complement 16 bit word
Word32   0x7fffffffL    0x80000000L    signed 2's complement 32 bit word

All the computations are done using a predefined set of basic operators. The description of
these operators is given in Table 11. The tables used by the simulation coder are summarized in
Table 12. These main programs use a library of routines that are summarized in Tables 13, 14,

and 15.

Table 11: Basic operations used in ANSI C simulation.

Operation                                                  Description
Word16 sature(Word32 L_var1)                               Limit to 16 bits
Word16 add(Word16 var1, Word16 var2)                       Short addition
Word16 sub(Word16 var1, Word16 var2)                       Short subtraction
Word16 abs_s(Word16 var1)                                  Short abs
Word16 shl(Word16 var1, Word16 var2)                       Short shift left
Word16 shr(Word16 var1, Word16 var2)                       Short shift right
Word16 mult(Word16 var1, Word16 var2)                      Short multiplication
Word32 L_mult(Word16 var1, Word16 var2)                    Long multiplication
Word16 negate(Word16 var1)                                 Short negate
Word16 extract_h(Word32 L_var1)                            Extract high
Word16 extract_l(Word32 L_var1)                            Extract low
Word16 round(Word32 L_var1)                                Round
Word32 L_mac(Word32 L_var3, Word16 var1, Word16 var2)      Mac
Word32 L_msu(Word32 L_var3, Word16 var1, Word16 var2)      Msu
Word32 L_macNs(Word32 L_var3, Word16 var1, Word16 var2)    Mac without sat
Word32 L_msuNs(Word32 L_var3, Word16 var1, Word16 var2)    Msu without sat
Word32 L_add(Word32 L_var1, Word32 L_var2)                 Long addition
Word32 L_sub(Word32 L_var1, Word32 L_var2)                 Long subtraction
Word32 L_add_c(Word32 L_var1, Word32 L_var2)               Long add with c
Word32 L_sub_c(Word32 L_var1, Word32 L_var2)               Long sub with c
Word32 L_negate(Word32 L_var1)                             Long negate
Word16 mult_r(Word16 var1, Word16 var2)                    Multiplication with round
Word32 L_shl(Word32 L_var1, Word16 var2)                   Long shift left
Word32 L_shr(Word32 L_var1, Word16 var2)                   Long shift right
Word16 shr_r(Word16 var1, Word16 var2)                     Shift right with round
Word16 mac_r(Word32 L_var3, Word16 var1, Word16 var2)      Mac with rounding
Word16 msu_r(Word32 L_var3, Word16 var1, Word16 var2)      Msu with rounding
Word32 L_deposit_h(Word16 var1)                            16 bit var1 -> MSB
Word32 L_deposit_l(Word16 var1)                            16 bit var1 -> LSB
Word32 L_shr_r(Word32 L_var1, Word16 var2)                 Long shift right with round
Word32 L_abs(Word32 L_var1)                                Long abs
Word32 L_sat(Word32 L_var1)                                Long saturation
Word16 norm_s(Word16 var1)                                 Short norm
Word16 div_s(Word16 var1, Word16 var2)                     Short division
Word16 norm_l(Word32 L_var1)                               Long norm

Table 12: Summary of tables.

File            Table name    Size      Description
tab_hup.c       tab_hup_s     28        upsampling filter for postfilter
tab_hup.c       tab_hup_l     112       upsampling filter for postfilter
inter_3.c       inter_3       13        FIR filter for interpolating the correlation
pred_lt3.c      inter_3       31        FIR filter for interpolating past excitation
lspcb.tab       lspcb1        128x10    LSP quantizer (first stage)
lspcb.tab       lspcb2        32x10     LSP quantizer (second stage)
lspcb.tab       fg            2x4x10    MA predictors in LSP VQ
lspcb.tab       fg_sum        2x10      used in LSP VQ
lspcb.tab       fg_sum_inv    2x10      used in LSP VQ
qua_gain.tab    gbk1          8x2       codebook GA in gain VQ
qua_gain.tab    gbk2          16x2      codebook GB in gain VQ
qua_gain.tab    map1          8         used in gain VQ
qua_gain.tab    imap1         8         used in gain VQ
qua_gain.tab    map2          16        used in gain VQ
qua_gain.tab    imap2         16        used in gain VQ
window.tab      window        240       LP analysis window
lag_wind.tab    lag_h         10        lag window for bandwidth expansion (high part)
lag_wind.tab    lag_l         10        lag window for bandwidth expansion (low part)
grid.tab        grid          61        grid points in LP to LSP conversion
inv_sqrt.tab    table         49        lookup table in inverse square root computation
log2.tab        table         33        lookup table in base 2 logarithm computation
lsp_lsf.tab     table         65        lookup table in LSF to LSP conversion and vice versa
lsp_lsf.tab     slope         64        line slopes in LSP to LSF conversion
pow2.tab        table         33        lookup table in 2^x computation
acelp.h                                 prototypes for fixed codebook search
ld8k.h                                  prototypes and constants
typedef.h                               type definitions
Table 13: Summary of encoder specific routines.

Filename       Description
acelp_co.c     search fixed codebook
autocorr.c     compute autocorrelation for LP analysis
az_lsp.c       compute LSPs from LP coefficients
cod_ld8k.c     encoder routine
convolve.c     convolution operation
               compute correlation terms for gain quantization
               encode adaptive codebook index
g_pitch.c      compute adaptive codebook gain
gainpred.c     gain predictor
int_lpc.c      interpolation of LSP
inter_3.c      fractional delay interpolation
lag_wind.c     lag-windowing
levinson.c     Levinson recursion
lspenc.c       LSP encoding routine
lspgetq.c      LSP quantizer
lspgett.c      compute LSP quantizer distortion
lspgetw.c      compute LSP weights
lsplast.c      select LSP MA predictor
lsppre.c       pre-selection first LSP codebook
lspprev.c      LSP predictor routines
lspsel1.c      first stage LSP quantizer
lspsel2.c      second stage LSP quantizer
lspstab.c      stability test for LSP quantizer
               closed-loop pitch search
pitch_ol.c     open-loop pitch search
pre_proc.c     pre-processing (HP filtering and scaling)
pwf.c          computation of perceptual weighting coefficients
qua_gain.c     gain quantizer
qua_lsp.c      LSP quantizer
relspwe.c      LSP quantizer

Table 14: Summary of decoder specific routines.

Filename       Description
               decode LP information
de_acelp.c     decode algebraic codebook
               decode gains
               decode adaptive codebook index
dec_ld8k.c     decoder routine
lspdec.c       LSP decoding routine
               post processing (HP filtering and scaling)
pred_lt3.c     generation of adaptive codebook
pst.c          postfilter routines



Table 15: Summary of general routines.

Filename       Description
basicop2.c     basic operators
bits.c         bit manipulation routines
gainpred.c     gain predictor
int_lpc.c      interpolation of LSP
inter_3.c      fractional delay interpolation
lsp_az.c       compute LP from LSP coefficients
lsp_lsf.c      conversion between LSP and LSF
lsp_lsf2.c     high precision conversion between LSP and LSF
lspexp.c       expansion of LSP coefficients
lspstab.c      stability test for LSP quantizer
p_parity.c     compute pitch parity
pred_lt3.c     generation of adaptive codebook
random.c       random generator
residu.c       compute residual signal
syn_filt.c     synthesis filter
weight_a.c     bandwidth expansion LP coefficients

Claims

1. A method for use in a speech decoder which fails to receive reliably at least a portion of a current frame of
compressed speech information, the speech decoder including a codebook memory and a signal amplifier, said
memory and amplifier for use in generating a decoded speech signal, the compressed speech information including a
scale-factor for use by said amplifier in scaling a signal reflecting a codebook vector, the method comprising:

attenuating a scale-factor corresponding to a previous frame of speech; and 

amplifying a signal reflecting a codebook vector corresponding to said current frame of speech in accordance 
with said attenuated scale factor. 

2. The method of claim 1 wherein the codebook memory comprises a fixed codebook.

3. The method of claim 2 wherein the step of attenuating comprises multiplying said scale factor by 0.98.

4. The method of claim 1 wherein the codebook memory comprises an adaptive codebook. 

5. The method of claim 4 wherein the step of attenuating comprises multiplying said scale factor by 0.9.

6. The method of claim 1 wherein said codebook memory is coupled directly to said amplifier.

7. The method of claim 1 wherein the speech decoder comprises a CELP speech decoder.

8. The method of claim 1 wherein the previous frame of speech comprises the frame of speech immediately prior to
the current frame of speech.

9. The method of claim 1 wherein the scale-factor being attenuated is determined based on coded speech information
received from a speech encoder.

10. The method of claim 1 wherein the scale factor being attenuated comprises a previously attenuated scale-factor.

11. The method of claim 1 wherein the step of amplifying comprises applying a scale-factor to said signal.

12. The method of claim 1 wherein the failure to receive reliably at least a portion of the current frame comprises a
failure to receive any bits representing a frame of speech information.

13. An apparatus for synthesizing a signal representing speech information, the apparatus failing to receive reliably at
least a portion of a current frame of compressed speech information, the apparatus comprising:

a codebook memory comprising codebook vector signals; 

means for attenuating a scale-factor corresponding to a previous frame of speech; and 
a signal amplifier, said signal amplifier applying the attenuated scale-factor to a signal reflecting a codebook
vector corresponding to said current frame.

14. The apparatus of claim 13 wherein the codebook memory further comprises a fixed codebook.

15. The apparatus of claim 14 wherein the means for attenuating comprises means for multiplying said scale-factor by
0.98.

16. The apparatus of claim 13 wherein the codebook memory comprises an adaptive codebook.

17. The apparatus of claim 16 wherein the means for attenuating comprises means for multiplying said scale factor by
0.9.

18. The apparatus of claim 13 wherein said codebook memory is directly coupled to said amplifier.
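
The attenuation recited in the claims can be sketched in C as follows. This is only an illustration under stated assumptions, not the decoder's actual code: the GainState structure and the functions update_gains and scale_vector are hypothetical names, gains are held as floating-point values for clarity, and only the factors 0.98 (fixed codebook) and 0.9 (adaptive codebook) are taken from the claims.

```c
#include <assert.h>
#include <stddef.h>

/* Attenuation factors recited in claims 3 and 5. */
#define FIXED_CB_ATTEN    0.98
#define ADAPTIVE_CB_ATTEN 0.9

/* Hypothetical decoder state: the scale-factors used for the previous frame. */
typedef struct {
    double fixed_gain;     /* fixed codebook scale-factor */
    double adaptive_gain;  /* adaptive codebook scale-factor */
} GainState;

/* Select the scale-factors for the current frame.
 * Good frame: adopt the received scale-factors.
 * Erased frame: attenuate the previous frame's scale-factors. Because the
 * state is overwritten, consecutive erasures attenuate an already attenuated
 * value, as in claim 10. */
void update_gains(GainState *st, int erased, double rx_fixed, double rx_adaptive)
{
    assert(st != NULL);
    if (erased) {
        st->fixed_gain *= FIXED_CB_ATTEN;
        st->adaptive_gain *= ADAPTIVE_CB_ATTEN;
    } else {
        st->fixed_gain = rx_fixed;
        st->adaptive_gain = rx_adaptive;
    }
}

/* Amplifier: apply a scale-factor to a signal reflecting a codebook vector. */
void scale_vector(const double *cb, double gain, double *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[i] = gain * cb[i];
    }
}
```

In this sketch a decoder would call update_gains once per frame and then pass st.fixed_gain or st.adaptive_gain to scale_vector for the corresponding codebook vector. During a run of erased frames the gains decay geometrically, so the synthesized signal fades out gradually instead of holding its last level.
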



EP 0 747 884 A3 






European Patent 
Office 



EUROPEAN SEARCH REPORT 



Application Number

EP 96 30 3853 



DOCUMENTS CONSIDERED TO BE RELEVANT 



Category

Citation of document with indication, where appropriate,
of relevant passages

Relevant
to claim

CLASSIFICATION OF THE
APPLICATION (Int.Cl.6)



A 

P,X



PROCEEDINGS OF THE GLOBAL
TELECOMMUNICATIONS CONFERENCE (GLOBECOM),
SAN FRANCISCO, NOV. 28 - DEC. 2, 1994,
vol. 2 OF 3, 28 November 1994, INSTITUTE 
OF ELECTRICAL AND ELECTRONICS ENGINEERS, 
pages 848-852, XP000488660 
HUSAIN A ET AL: "CLASSIFICATION AND 
SPECTRAL EXTRAPOLATION BASED PACKET 
RECONSTRUCTION FOR LOW-DELAY SPEECH 
CODING" 

* abstract; figure 2 * 

* page 849, right-hand column, line 4 - 
line 23 * 



EP 0 673 G16 A (AT & T CORP) 20 September 
1995 

* abstract * 

US 5 383 202 A (EDGAR GREGORY A ET AL) 17 
January 1995 

* abstract; claim 1 * 

SPEECH COMMUNICATION, 

vol. 12, no. 2, 1 June 1993, 

pages 103-111, XP000390528 

JUIN-HWEY CHEN ET AL: "THE CREATION AND 

EVOLUTION OF 16 KBIT/S LD-CELP: FROM 

CONCEPT TO STANDARD" 

* abstract * 

* page 105, right-hand column, line 17 - 
line 26 * 

EP 0 459 358 A (NIPPON ELECTRIC CO) 4 
December 1991 

* abstract; claim 1 * 

* page 3, line 7 - line 50 * 



1,7,8,13 G10L9/14

2,4,14
1

1,13
1,13

TECHNICAL FIELDS
SEARCHED (Int.Cl.6)

G10L



The present search report has been drawn up for all claims






Place of search

THE HAGUE 



14 July 1997 



Van Doremalen, J 



CATEGORY OF CITED DOCUMENTS

X : particularly relevant if taken alone
Y : particularly relevant if combined with another
    document of the same category
A : technological background
O : non-written disclosure
P : intermediate document

T : theory or principle underlying the invention
E : earlier patent document, but published on, or
    after the filing date
D : document cited in the application
L : document cited for other reasons

& : member of the same patent family, corresponding
    document






